Journal of the 
AMERICAN STATISTICAL 
ASSOCIATION 








DECEMBER 1956 


The Automatic Computer in Industry 
The Effect of Respondent Ignorance on Survey Results 


A Method of Estimating the Intercensal Population of Counties
Albert H. Crosetti and Robert C. Schmitt

Research on Metropolitan Population: Evaluation of Data . . Otis Dudley Duncan

Measuring Spatial Association with Special Consideration of the Case of Market
Orientation of Production . . William Warntz

Practical Value of International Educational Statistics . . Gustave Zakrzewski

Regression Techniques Applied to Seasonal Corrections and Adjustments for
Calendar Shifts . . Harry Eisenpress
The Ranking of Variances in Normal Populations . . H. A. David
The Condition for Lot Size Production . . Myron J. Gordon and William J. Taylor
Distributions Possessing a Monotone Likelihood Ratio
Samuel Karlin and H. Rubin
Quadratic Extrapolation and a Related Test of Hypotheses . . A. de la Garza
CORRIGENDA 
STATISTICAL ABSTRACTS 
BOOK REVIEWS by R. L. Anderson, F. J. Anscombe, Herbert
Arkin, David M. Blank, Fred W. Braga, Edward C. Budd, Robert R. Bush,
William G. Cochran, Bernard P. Cohen, James S. Coleman, Harold F. Dorn,
George Garvy, Owen C. Gretton, Robert O. Harvey, J. B. Hassler, Bert F.
Hoselitz, Thor Hultgren, Neil H. Jacoby, D. Gale Johnson, Howard W. Johnson,
A. W. Marshall, Edwin S. Mills, Lincoln E. Moses, Gottfried E. Noether,
William N. Parker, Emmett J. Rice, A. E. Sarhan, Robert E. Snyder, Irene B.
Taeuber, Milton E. Terry


PUBLICATIONS RECEIVED .. 
INDEX TO VOLUME 51, 1956 





VOLUME 51 NUMBER 276 





American Statistical Association 


Organized November 27, 1839, Incorporated 1841 


The American Statistical Association is a scientific and educational organi-
zation. Its membership is not confined to professional statisticians but includes
economists, business executives, research directors, government officials, uni-
versity professors, and others who are seriously interested in the application of
statistical methods to practical problems, in the development of more useful 
methods, and in the improvement of basic statistical data. Engineers, mathe- 
maticians, biologists, actuaries, sociologists, psychologists, and representatives of 
many other professions are included in the membership of the Association. 


Annual Dues*
Residents of North America . . . . . . . . . . . . . . . . . . . $ 8
Others . . . . . . . . . . . . . . . . . . . . . . . . . . . . .   5
Student membership . . . . . . . . . . . . . . . . . . . . . . .   5
Introductory membership (first dues payment of members under
30 years old) . . . . . . . . . . . . . . . . . . . . . . . . . .   5
Member's subscription to Biometrics . . . . . . . . . . . . . . .   4
Contributing membership . . . . . . . . . . . . . . . [illegible]
Institutional membership, minimum . . . . . . . . . . [illegible]


* Includes subscription to the Journal of the American Statistical Association, and to The Ameri-
can Statistician.


Subscription rate, $8 per year. Prices for back issues available on request. A
cumulative Index to Volumes 1-34, 1888-1939, may be obtained from the 
Secretary. 

For further information about the Association and membership application
forms, write the Secretary, 1757 K St., N. W., Washington 6, D.C. 


The present Institutional Members are: 


American Management Association
American Telephone & Telegraph Company
Armour & Company
Armstrong Cork Company
The Atlantic Refining Company
Bell Telephone Laboratories
Chrysler Corporation
The Columbia Gas System, Inc.
Deere & Company
The Dow Chemical Company
Dun and Bradstreet, Inc.
Eastman Kodak Company
The Equitable Life Assurance Society of the U. S.
[illegible]
General Motors Corporation
Humble Oil and Refining Company
International Brotherhood of Teamsters, Chauffeurs, Warehousemen & Helpers of America
International Business Machines Corp.
International Harvester Company
International Ladies Garment Workers' Union
Kennecott Corporation
Metropolitan Life Insurance Company
National Analysts, Inc.
National Association of [illegible]
National Bank of Detroit
National Industrial Conference Board, Inc.
New York Telephone Company
Philadelphia Electric Company
Schering Corporation
Sears Roebuck & Company
Socony Mobil Oil Company, Inc.
Standard Oil Company (Indiana)
Standard Oil Company of New Jersey
[illegible]
United States Steel Corporation
Western Electric Company, Inc.
Young & Rubicam, Inc.


Published Quarterly by the AMERICAN STATISTICAL ASSOCIATION







JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


The Editors welcome the submission of manuscripts for possible publication. They 
should be typewritten entirely double-spaced, including footnotes, and two copies should 
be sent to the Editor, W. Allen Wallis, 207 Haskell Hall, University of Chicago, Chicago 
37. Books for review should be sent to the same address. Unsolicited book reviews are not 
accepted, but suggestions of titles for review are welcome. 


EDITOR 
W. Allen Wallis, University of Chicago
ASSISTANT TO THE EDITOR: Dorothy D. Fisher


ASSOCIATE EDITORS 


Sidney S. Alexander                       Churchill Eisenhart
Massachusetts Institute of Technology     National Bureau of Standards
Millard Hastay                            George M. Kuznets
City College of New York                  University of California


I. Richard Savage
Stanford University


ADVISORY PANEL OF FORMER EDITORS 
William G. Cochran (1945-50)              Frank A. Ross (1929-34, 41-45)
Johns Hopkins University                  Thetford, Vermont


William F. Ogburn (1920-25)               Frederick F. Stephan (1935-40)
University of Chicago                     Princeton University


Corrigenda: Readers and authors are urged to submit to the Editor notices of errors
found in this or any previous volume. These will be published once a year, in the
December issue.







NONPARAMETRIC METHODS 
IN STATISTICS* 


By D. A. S. FRASER, University of Toronto. Nonparametric
methods which have developed so rapidly since the postwar years 
are collected and unified in this book. The author treats nonparametric 
theory as an integrated part of statistics and presents nonparametric 
forms of standard statistical problems for both large and small 
samples. 1956. Approx. 316 pages. Prob. $8.50 


STATISTICAL ANALYSIS OF 
STATIONARY TIME SERIES* 
By ULF GRENANDER, University of Stockholm, Sweden; and 
MURRAY ROSENBLATT, Indiana University. The first comprehen-
sive treatment in book form of the statistical theory of spectral analysis
of stochastic processes and related subjects. 1956. Approx. 368
pages. $11.00.


BUSINESS FORECASTING IN PRACTICE 
Principles and Cases 
Edited by ADOLPH G. ABRAMSON, SKF Industries, Inc.; and 
RUSSELL H. MACK, Temple University. Six experienced fore-
casters show the step-by-step reasoning and techniques by which
they reach conclusions from available data. 1956. 275 pages.
$6.50.


PETROGRAPHIC MODAL ANALYSIS 
An Elementary Statistical Analysis 


By FELIX CHAYES, Carnegie Institution of Washington. 1956. 
113 pages. $5.50. 


* Wiley Publications in Statistics, 
Walter A. Shewhart and S. S. Wilks, Editors 


Send today for trial copies. 


JOHN WILEY & SONS, Inc. 
440 Fourth Avenue New York 16, N.Y. 


Please mention the Journal of the American Statistical Association in writing advertisers





JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 








Volume 51                    December, 1956                    Number 276
ARTICLES 
The Automatic Computer in Industry . . . Thornton C. Fry 565
The Effect of Respondent Ignorance on Survey Results . . . Robert Ferber 576
A Method of Estimating the Intercensal Population of Counties
    . . . Albert H. Crosetti and Robert C. Schmitt 587
Research on Metropolitan Population: Evaluation of Data
    . . . Otis Dudley Duncan 591
Measuring Spatial Association with Special Consideration of the Case of Market
Orientation of Production . . . William Warntz 597
Practical Value of International Educational Statistics . . . Gustave Zakrzewski 605
Regression Techniques Applied to Seasonal Corrections and Adjustments for
Calendar Shifts . . . Harry Eisenpress 615
The Ranking of Variances in Normal Populations . . . H. A. David
The Condition for Lot Size Production
    . . . Myron J. Gordon and William J. Taylor 627
Distributions Possessing a Monotone Likelihood Ratio
    . . . Samuel Karlin and H. Rubin 637
Quadratic Extrapolation and a Related Test of Hypotheses . . . A. de la Garza 644
CORRIGENDA . . . 650
STATISTICAL ABSTRACTS . . . 653
PUBLICATIONS RECEIVED . . . 711
INDEX TO VOLUME 51, 1956, Nos. 273-6 . . . 713
BOOK REVIEWS 
Eckert, W. J., and Jones, Rebecca, Faster, Faster: A Simple Description of a Giant
Electronic Calculator and the Problems It Solves . . . Thornton C. Fry 565
Savage, Leonard J., The Foundations of Statistics . . . F. J. Anscombe 657
Neyman, Jerzy, Editor, Proceedings of the Third Berkeley Symposium on Mathe-
matical Statistics and Probability. Vol. III: Contributions to Astronomy and Phys-
ics; Vol. IV: Contributions to Biology and Problems of Health; Vol. V: Contribu-
tions to Econometrics, Industrial Research, and Psychometry . . . [illegible]
Mills, Frederick C., Introduction to Statistics . . . Edwin S. Mills 660
Spurr, William A., Kellogg, Lester S., and Smith, John H., Business and Eco-
nomic Statistics . . . Gottfried E. Noether 660
Guilford, J. P., Fundamental Statistics in Psychology and Education, Third Edition
    . . . Robert R. Bush 661
[illegible], Curso de Estadística. Vol. I: Estadística Descriptiva y Modelo
Matemático . . . Jorge Arias B. 662
Johnson, Robert E., and Morris, Dean, Guide to Elementary Statistical For-
mulas . . . Herbert Arkin 663
Wallis, W. Allen, and Roberts, Harry V., Statistics: A New Approach
    . . . William G. Cochran 664
Chapin, F. Stuart, Experimental Designs in Sociological Research, Revised Edition
    . . . Harold F. Dorn 666
Federer, Walter T., Experimental Design: Theory and Application
    . . . R. L. Anderson 667
Bowker, Albert H., and Lieberman, Gerald J., Handbook of Industrial Statistics
    . . . Milton E. Terry 669
[illegible] Thompson, Nellie Landblom's Copybook for Beginners in Re-
search Work . . . A. E. Sarhan 671
Devons, Ely, An Introduction to British Economic Statistics . . . Thor Hultgren 672
Technical Assistance Mission No. 77, [illegible], Industrial Cen-
suses in the United States . . . Owen C. Gretton 674
Riley, Vera, and Allen, Robert L., Interindustry Economic Studies . . . [illegible] 676
United Nations Statistical Office, Methods of National Income Estimation
    . . . Edward C. Budd 676
Goldsmith, Raymond W., A Study of Saving in the United States, Vols. I and II
    . . . George Garvy 677
Goldsmith, Raymond W., Brady, Dorothy S., and Mendershausen, Horst,
A Study of Saving in the United States, Vol. III . . . George Garvy 677
Board of Governors of the Federal Reserve System, Flow of Funds in the
United States, 1939-1953 . . . Emmett J. Rice 682
Klingman, Herbert F., Electronics in Business: A Case Study in Planning: Port of
New York Authority . . . Fred W. Braga 685
May, Florence A., Editor, Electronics in Business: A Descriptive Reference Guide
    . . . Fred W. Braga 685
Morton, J. E., Urban Mortgage Lending: Comparative Markets and Experience
    . . . Robert O. Harvey 686
Cuzzort, Raymond P., Suburbanization of Service Industries within Standard Metro-
politan Areas . . . David M. Blank 688
Mehta, M. M., Structure of Indian Industries . . . [illegible] 690
Hoffmann, Walther G., British Industry, 1700-1950 . . . William N. Parker 693
Prochnow, Herbert V., Editor, Determining the Business Outlook . . . [illegible] 695
Fellner, William, Trends and Cycles in Economic Activity . . . Bert F. Hoselitz 696
Joint Committee on the Economic Report to the Congress of the United
States, Characteristics of the Low-Income Population and Related Federal Pro-
grams . . . D. Gale Johnson 698
Joint Committee on the Economic Report to the Congress of the United
States, A Program for the Low-Income Population at Substandard Levels of Liv-
ing; 1956 Report on Economic Statistics . . . D. Gale Johnson 698
Palmer, Gladys L., Philadelphia Workers in a Changing Economy
    . . . Howard W. Johnson 699
Heady, Earl O., Johnson, Glenn L., and Hardin, Lowell S., Editors, Resource
Productivity, Returns to Scale, and Farm Size . . . J. B. Hassler 700
U. S. Department of Defense, Mathematical Models of Human Behavior
    . . . James S. Coleman 701
Blumen, Isadore, Kogan, Marvin, and McCarthy, Philip J., The Industrial
Mobility of Labor as a Probability Process
    . . . Robert R. Bush and Bernard P. Cohen 702
[illegible] G., Statistics of [illegible] . . . Lincoln E. Moses 704
Milbank Memorial Fund, Current Research in Human Fertility
    . . . Irene B. Taeuber 705
[illegible], Trends and Differentials in Mortality . . . George F. Mair 706
Malzberg, Benjamin, and Lee, Everett S., Migration and Mental Disease: A
Study of First Admissions to Hospitals for Mental Disease, New York, 1939-1941
    . . . A. W. Marshall 708
Fisher, J. W., and Clarke, E. E., The [illegible] of Rates of Separations from Men-
tal Hospitals . . . Harold F. Dorn 709





1956 OFFICERS, AMERICAN STATISTICAL ASSOCIATION 


Board of Directors

President
Gertrude M. Cox

President-Elect
William R. Leonard

Past President
Ralph J. Watkins

Vice-Presidents
Henry Scheffé
John W. Tukey
Martin R. Gainsbrugh

Directors
Churchill Eisenhart
Alfred N. Watson
Lester R. Frankel
W. J. Youden
Frank R. Garfield
John W. Boatwright

Secretary-Treasurer
Donald C. Riley


Members of the Council
Harry Alpert
Sybil P. Bindloss
Irwin D. J. Bross
William J. Carson
Besse B. Day
Lucile Derrick
Murray Dorkin
S. M. Free
Irwin Friend
John E. Freund
Boyd Harshbarger
Virginia T. Holran
Paul G. Homeyer
Arthur Littell
A. J. Jaffe
William O. Jones
William H. Kester
John C. McKee
Kenneth B. Williams
Paul Meier
Felix E. Moore
Geoffrey Moore
Jack Moshman
Horace W. Norton
Morris Hamburg
John R. Stockton
Conrad Taeuber
W. Allen Wallis


Section Chairmen
Irwin D. J. Bross, Biometric Section
Kenneth B. Williams, Business and Economic Statistics Section
Besse B. Day, Section on Physical and Engineering Sciences
Felix E. Moore, Social Statistics Section
John E. Freund, Section on Training





Tables and other aids to computation appearing in this Journal are abstracted
and indexed in Mathematical Tables and Other Aids to Computation.

A cumulative Index to Volumes 1-34, 1888-1939, may be obtained from the
Office of the Secretary of the American Statistical Association.
Reprints of most articles published since 1945 may be purchased from the
office of the Secretary.








EDITORIAL COLLABORATORS 


Forman S. Acton, Princeton University
M. A. Adelman, Massachusetts Institute of Technology
Jack Alterman, Bureau of Labor Statistics
R. L. Anderson, University of North Carolina
Kenneth J. Arnold, Michigan State University
M. M. Babbar, University of Costa Rica
T. A. Bancroft, Iowa State College
Joseph Berkson, Mayo Clinic
Max A. Bershad, Bureau of the Census
Colin R. Blyth, University of Illinois
Donald J. Bogue, University of Chicago
John V. Breakwell, North American Aviation Company
K. A. Brownlee, University of Chicago
Joseph G. Bryan, Massachusetts Institute of Technology
Joseph M. Cameron, National Bureau of Standards
Philip J. Clark, University of Michigan
A. C. Cohen, Jr., University of Georgia
W. S. Connor, National Bureau of Standards
Jerome Cornfield, National Institutes of Health
David R. Cox, University of North Carolina
John H. Curtiss, American Mathematical Society
Cuthbert Daniel, New York City
Frank T. Denton, Dominion Bureau of Statistics
H. F. Dodge, Bell Telephone Laboratories
Thomas G. Donnelly, Dominion Bureau of Statistics
Acheson J. Duncan, Johns Hopkins University
David B. Duncan, University of North Carolina
John D. Durand, United Nations
J. Durbin, London School of Economics
Meyer Dwass, Northwestern University
Paul S. Dwyer, University of Michigan
Walter T. Federer, Cornell University
Karl Fox, Iowa State College
Donald A. Gardiner, Oak Ridge National Laboratory
Seymour Geisser, National Bureau of Standards
Leon Gilford, Bureau of the Census
Leo A. Goodman, University of Chicago
Joseph Albert Greenwood, Navy Bureau of Aeronautics
Frank E. Grubbs, Ballistic Research Laboratories
Shanti S. Gupta, Bell Telephone Laboratories
Robert Hader, North Carolina State College
James Hannan, Michigan State University
H. O. Hartley, Iowa State College
W. C. Healy, Jr., Ethyl Corporation
Urs W. Hochstrasser, American University
Wassily Hoeffding, University of North Carolina
Richard A. Hornseth, Bureau of the Census
Daniel G. Horvitz, University of Pittsburgh
Abram J. Jaffe, Columbia University
Howard L. Jones, Illinois Bell Telephone Company
J. H. B. Kemperman, Purdue University
Oscar Kempthorne, Iowa State College
Maurice G. Kendall, London School of Economics
Jack Kiefer, Cornell University
Leslie Kish, University of Michigan
Avram Kisselgoff, Allied Chemical and Dye Corporation
Tjalling C. Koopmans, Yale University
William H. Kruskal, University of Chicago
Joseph Henry Kullback, Washington, D.C.
Gerald J. Lieberman, Stanford University
Julius Lieblein, National Bureau of Standards
William G. Madow, University of Illinois
Henry B. Mann, Ohio State University
Nathan Mantel, National Institutes of Health
Eli S. Marks, National Analysts, Inc.
Frank J. Massey, Jr., University of California (Los Angeles)
Paul Meier, Johns Hopkins University
Perry Meyers, Perry Meyers, Inc.
Frederick Mosteller, Harvard University
Gottfried E. Noether, Boston University
Edwin G. Olds, Carnegie Institute of Technology
Paul S. Olmstead, Bell Telephone Laboratories
Guy Orcutt, Harvard University
John W. Pratt, University of Chicago
Leon Pritzker, Case Institute of Technology
Howard Raiffa, Columbia University
Herbert Robbins, Columbia University
Donald M. Roberts, Stanford University
Joan R. Rosenblatt, National Bureau of Standards
Gideon Rosenbluth, Queen's University, Ontario
Charles L. Schultze, Council of Economic Advisers
Elizabeth L. Scott, University of California (Berkeley)
E. F. Sheffield, Dominion Bureau of Statistics
Henry S. Shryock, Jr., Bureau of the Census
Irving H. Siegel, Council of Economic Advisers
Robert M. Solow, Massachusetts Institute of Technology
Paul N. Somerville, American University
John Todd, National Bureau of Standards
David L. Wallace, University of Chicago
F. E. Whitworth, Dominion Bureau of Statistics
Marvin Zelen, National Bureau of Standards





JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 276 DECEMBER, 1956 Volume 51 


THE AUTOMATIC COMPUTER IN INDUSTRY* 


THORNTON C. Fry 
Bell Telephone Laboratories 


Much oh-ing and ah-ing has been heard in the land about those prodigious
giants, the new electronic computing machines, and about their near
relative, automation. If the admiration is directed at the shrewd analysis of
needs and careful tailoring of parts to meet them that is necessary if the
creature is to fit its intended use as well as may be, it is fully warranted. It is
equally warranted if directed at the inventiveness and engineering skill through
which systems are constructed which function for hours or days without pause
or error, though they are built of millions of fallible parts. It is warranted if it is
in recognition of the additional freedom of scientific investigation which rapid
calculation provides. I fear, however, that much of the wonderment has been
less perceptive. It can hardly be otherwise when catchy phrases like “electronic
brain” are used too freely.

Faster, Faster is a clear and readable description of one of the largest, fastest,
and most versatile of these giants, written in language which any intelligent
man can read. It concerns the Naval Ordnance Research Calculator (NORC
for short) which the International Business Machines Corporation developed.
It radiates pride of accomplishment, and the reader is left in no possible doubt
whose accomplishment is meant. It is splendid publicity for IBM.

But it is far more than that. As the preface states, electronic calculators “all
conform to a general pattern, so that anyone who is familiar with one of them
can quickly understand another.” Hence Faster, Faster is also good reading for
those who wish a broad perspective on what modern calculators are and what
they can do. For this purpose I can recommend Faster, Faster as the best single
document on computers of which I know. It is technically accurate, its style is
clear and vivid, it is admirably successful in avoiding both the confusions of
engineering jargon and unnecessary detail on the one hand, and the confusions
of half-truths and mystic metaphor on the other.

This is no mean accomplishment, for there has been a great deal of con-
fusion about these giant machines. Take, for example, the word “automatic.”
In what sense are devices such as NORC automatic?

In its primary meaning, an automatic agent is one that can do as it pleases.
In that proper sense, no computing system, except perhaps certain statistical
devices containing random elements, could be automatic, since the purpose of
all computation is to determine specific, fully conditioned, though unknown,
numbers.

* An invited review article on W. J. Eckert and Rebecca Jones, Faster, Faster: A Simple Description of a Giant
Electronic Calculator and the Problems It Solves (New York: McGraw-Hill Book Company, 1955. Pp. xiii, 160. $3.75).

Another use of the word automatic is to denote a capacity for repetitive 
operation. “Automatic” machines in a factory, for example, will continue to 
perform a certain task repetitively until either the material gives out, the 
power is shut off, or trouble develops. Such machines characteristically produce 
a succession of identical outputs. They are self-acting in the sense of being 
relatively free from external control; but the built-in controls are so rigid that 
they have no latitude of performance whatever. A computing machine which 
was automatic in this sense would be of no use, since there is no reason to com- 
pute the same answer time after time without variation. 

A third use of the word is for machines which can perform any one of a 
number of tasks—usually not a very large number—and will do so when, and 
in whatever order or combination they are instructed. They resemble trained 
animals at a circus, which respond promptly to the commands of their trainer, 
but which spoil the act if they show a mind of their own. Such devices are not 
self-acting in either of the earlier senses; the external control is now an essential 
and necessary element in their performance. Take as an example an automatic 
typewriter. It can produce a few dozen symbols. But which ones it produces, 
and in what order, or even whether it produces any at all, are completely de- 
termined by signals from some external source. Its output may range from the 
Bible to Boccaccio; unlike an automatic screw machine the product may be of
almost infinite variety; but this variety is obtained through sequence or ar- 
rangement of a small number of elementary operations. 

I think it is important to realize that when we speak of “automatic” comput- 
ing machines, we are using the word in much this same sense. We will then not 
be so over-awed by the bigness of the Ordvacs, and Eniacs, the NORCs and 
the Univacs. For it is a fact that they are all devices for performing the same 
few very simple basic tasks which smaller machines perform. They are big 
principally because some parts are duplicated many times, and because more 
or less elaborate provision is made to assist the user in giving orders to the 
machine. 

What are the basic tasks which automatic computers perform? We can al- 
most count them on the fingers of one hand. When so instructed they 


(1) Read a number 

(2) Write a number 

(3) Change the sign of a number 
(4) Change its decimal point 

(5) Add two numbers. 


Subtraction is a combination of 3 and 5; multiplication of 4 and 5; division of 
3, 4 and 5; and so on. When such a thing as sin x is needed, it is either read
directly from a table, or it is calculated by a suitable succession of the five 
basic tasks. Some people will prefer to state these basic tasks in other ways, 
and some may argue for adding such things as “read an instruction,” or “com- 
pare two numbers” to the list. I need not argue with them—for another audi- 
ence and for another purpose I might myself use another classification. The 
point here is that even the biggest machines can do only a very few different
tasks, and that these tasks are very simple. Their ability to solve difficult prob- 
lems resides in the fact that they can be caused to execute complicated patterns 
of these elementary tasks; it resides in the effectiveness of their controls. It is 
not surprising, then, that in these very large machines the computer proper is 
only a small part—seldom more than 10 per cent—of the whole. 
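
The reduction of arithmetic to these few tasks can be made concrete with a short sketch. The Python below is an illustration only and is not taken from NORC or from Faster, Faster: the function names (negate, shift, add, subtract, multiply) are hypothetical, and whole numbers are assumed for simplicity. Subtraction is built from tasks 3 and 5, and multiplication from tasks 4 and 5, much as on a desk calculator.

```python
# Illustrative sketch only: hypothetical names, whole numbers assumed.

def negate(x: int) -> int:
    """Basic task 3: change the sign of a number."""
    return -x


def shift(x: int, places: int) -> int:
    """Basic task 4: move the decimal point `places` digits to the right.

    Done here by appending zeros to the written digits, to stress that no
    multiplication is involved.
    """
    return int(str(x) + "0" * places)


def add(x: int, y: int) -> int:
    """Basic task 5: add two numbers."""
    return x + y


def subtract(x: int, y: int) -> int:
    """Tasks 3 and 5: change the sign of y, then add."""
    return add(x, negate(y))


def multiply(x: int, y: int) -> int:
    """Tasks 4 and 5: for each digit of y, shift x and add it that many times."""
    result_is_negative = (x < 0) != (y < 0)
    x, y = abs(x), abs(y)
    total = 0
    for place, digit in enumerate(reversed(str(y))):
        shifted = shift(x, place)
        for _ in range(int(digit)):
            total = add(total, shifted)
    return negate(total) if result_is_negative else total
```

Division could be assembled in the same way from repeated subtraction and shifting; it is left out to keep the sketch short.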

Among the five fundamental operations, reading and writing are the two 
which have to be performed most frequently. We can see why this is so, if we 
consider what the machine must do in the simple case of adding two numbers. 
First it must get the order to add—this is a reading operation. Then it must get 
the numbers—two more reading operations. It must add them. It must receive 
an order where to put the answer—another reading operation. It must put it 
there—a writing operation. Thus there are five reading and writing operations 
for one addition. 
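
The count of five can be checked with a toy model. This is a sketch under assumptions of my own: the class TinyMachine, its order format ("ADD", a, b) followed by ("STORE", dest), and the addresses used are all hypothetical and do not describe any real machine.

```python
# Toy model: counts the reading and writing operations behind one addition.
# The order format and the addresses are assumptions made for illustration.

class TinyMachine:
    def __init__(self, memory):
        self.memory = dict(memory)
        self.reads = 0
        self.writes = 0

    def read(self, address):
        self.reads += 1
        return self.memory[address]

    def write(self, address, value):
        self.writes += 1
        self.memory[address] = value

    def add_once(self, order_address):
        _, a, b = self.read(order_address)       # read 1: the order to add
        x = self.read(a)                         # read 2: first number
        y = self.read(b)                         # read 3: second number
        total = x + y                            # the addition itself
        _, dest = self.read(order_address + 1)   # read 4: where to put the answer
        self.write(dest, total)                  # write 1: put it there


machine = TinyMachine({
    0: ("ADD", 10, 11),
    1: ("STORE", 12),
    10: 2,
    11: 3,
})
machine.add_once(0)
```

After the call, machine.reads is 4 and machine.writes is 1, which together make the five operations counted in the text.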

Reading and writing make up such a large part of the total effort of any digi- 
tal computer that its over-all speed is to a very high degree determined by how 
fast it can read and write, and by the extent to which such operations can 
overlap. Of course, in order to read an instruction or a number, one must first 
locate the place where it is written, and before writing one must locate the spot 
to be written on. Such searching is a necessary part of reading and writing, and 
will also have an important bearing on the speed of computations. 

This brings us to another point. If numbers once read were never again 
needed, and numbers once written were never again referred to, the design of an 
efficient computer would be a very simple matter. But in general, this is not 
the case. In scientific problems a number derived at one point in a computation 
may have to be put aside while other operations are performed, and then used 
again, perhaps several times, at later stages of the work. Even in such a rela- 
tively simple matter as payroll accounting, the total of an employee’s preceding 
week’s pay must be brought out each week, and a new total computed, before 
his social security deduction can be determined. Such numbers must be stored 
or “remembered” in order that they may be used again; and experience soon 
teaches, if it is not obvious beforehand, that the facility with which a problem 
can be solved on a given type of machine depends very largely on the number 
of items which it can store for later use. Or, to say this in another and better 
way, how big a machine is needed to meet a particular job is largely determined 
by the number of items which must be stored for ready reference. 
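
The payroll case shows in miniature why a stored number is indispensable. In the sketch below the rate, the wage ceiling, and the names (weekly_deduction, RATE_PERCENT, WAGE_CEILING_CENTS) are illustrative assumptions and are not figures from the article; amounts are kept in whole cents. The deduction for the week cannot be found from the week's pay alone, because it depends on the running total brought forward.

```python
# Illustrative only: the rate and ceiling are assumed, not taken from the article.
RATE_PERCENT = 2
WAGE_CEILING_CENTS = 420_000


def weekly_deduction(total_before_cents: int, pay_cents: int):
    """Return (deduction, new running total) for one week's pay.

    Only pay falling under the ceiling is subject to the deduction, so the
    stored running total must be read before the deduction can be computed.
    """
    room_left = WAGE_CEILING_CENTS - total_before_cents
    taxable = max(0, min(pay_cents, room_left))
    deduction = taxable * RATE_PERCENT // 100
    return deduction, total_before_cents + pay_cents


stored_total = 415_000                      # remembered from preceding weeks
deduction, stored_total = weekly_deduction(stored_total, 10_000)
```

Here only half of the week's pay falls under the assumed ceiling, so the deduction is 100 cents, and the new total of 425,000 cents must be written back for use the following week.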

There are many ways to store numbers. A very common one is paper and ink. 
This is the one used when computations are made on an ordinary desk calcula- 
tor. Numbers which the machine cannot retain in its very limited storage sys- 
tem are written down by the operator, and inserted in the machine when next 
required. Other numbers may be taken from printed tables, such as tables of 
sines and cosines, or of empirical data. These are storage systems of a more per- 
manent sort. This form of storage is cheap and almost unlimited in capacity, 
but it is also inconvenient, slow, and liable to error. At the other extreme, 
numbers can be stored on cathode ray tubes, vacuum tubes, semi-conductor
diodes, or in quartz or mercury carrousels. Such storage is readily accessible 
and fast—reading and writing times are measured in millionths of a second— 
and it can be made reliable; but because it is bulky and expensive there are 
practical limitations on the amount that can be provided. Between these two
extremes there are punched cards, magnetic tape, punched paper tape, mag- 
netic drums, photographic film, and others. These provide larger amounts of 
storage at less cost, but with compensating disadvantages. Generally, the big 
machines have as much of the expensive, quick storage as economics will allow, 
supplemented by larger amounts of the slower but cheaper sorts. NORC, for 
example, uses cathode ray tubes and magnetic tape. Smaller machines have 
less of the expensive kinds in varying degrees; and the very simplest accounting 
machines and desk calculators have almost no internal memory and must be 
supplemented by liberal amounts of ink and paper. 

So much for storage. Now about controls. We have said that computers do 
only a few very simple tasks, and those only under the guidance of detailed 
instructions. Many readers will remember the old, crank-operated Monroe. To 
multiply with it one had to punch a number into it; turn the crank repeatedly, 
thus repeatedly ordering it to add this number; move the carriage at appropri- 
ate times, thus ordering the machine to shift the decimal point; and eventually 
record the answer on paper. Today’s calculators, even the most magnificent 
of them, must get equally detailed instructions from somewhere. They may 
get them directly from the operator, either “live” or more usually as a recording 
prepared in advance. Such direct instructions must cover every detail of opera- 
tion, and obviously require much labor to prepare. However, where the same 
arithmetical operation recurs frequently, so that an identical sequence of in- 
structions must be repeated time after time, human effort may be saved by 
devising a mechanism which will give out this routine. Multiplication and di- 
vision, for example, are such universal operations that modern desk calculators 
incorporate mechanical routines for them. Desk calculators do not have auto- 
matic routines for finding the sine of an angle automatically, but many large 
digital computers do. This may, for example, consist of a table of the sines and 
cosines of 0°, 30°, 60°, 90°, 120°, . . . , stored on vacuum tubes (this requires only
two distinct numbers other than 0 and 1), and a built-in routine for finding 
sines of intermediate angles by means of a Taylor’s series. 
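
A routine of this kind might be sketched as follows. This is a modern illustration and not the routine of any particular machine: the names (SINE_TABLE, table_sin, table_cos, sine_degrees) and the choice of eight series terms are assumptions. The table holds sines at multiples of 30 degrees, built from only two stored constants besides 0 and 1, and the Taylor series is taken about the nearest tabulated angle, so that the remaining step h never exceeds 15 degrees.

```python
import math

# The only two stored constants other than 0 and 1.
HALF = 0.5
ROOT3_OVER_2 = math.sqrt(3) / 2

# Sines of 0, 30, 60, ..., 330 degrees.
SINE_TABLE = [0.0, HALF, ROOT3_OVER_2, 1.0, ROOT3_OVER_2, HALF,
              0.0, -HALF, -ROOT3_OVER_2, -1.0, -ROOT3_OVER_2, -HALF]


def table_sin(k: int) -> float:
    """Sine of k * 30 degrees, read from the table."""
    return SINE_TABLE[k % 12]


def table_cos(k: int) -> float:
    """Cosine of k * 30 degrees, using cos a = sin(a + 90 degrees)."""
    return SINE_TABLE[(k + 3) % 12]


def sine_degrees(angle: float, terms: int = 8) -> float:
    """Sine by Taylor series about the nearest tabulated angle a:

    sin(a + h) = sin a + h cos a - (h^2/2!) sin a - (h^3/3!) cos a + ...
    """
    k = round(angle / 30)
    h = math.radians(angle - 30 * k)
    s, c = table_sin(k), table_cos(k)
    derivatives = [s, c, -s, -c]      # successive derivatives of sine at a
    total = 0.0
    power = 1.0                       # holds h**n / n!
    for n in range(terms):
        total += derivatives[n % 4] * power   # each product is a multiplication order
        power = power * h / (n + 1)
    return total
```

With eight terms and h no larger than 15 degrees, the first neglected term is below one part in a thousand million, so sine_degrees(40) agrees with a library sine to about nine decimal places.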

Here we meet a peculiar phenomenon. To compute such series requires fre- 
quent multiplication. But surely any machine in which automatic computation 
of sines is incorporated will also have a routine for multiplication. It is un-
necessary therefore to build detailed instructions for multiplication into the sine
routine. Instead, a simple order at appropriate times to start the already avail- 
able multiplication routine will suffice. So we now have three levels of authority 
—at the bottom, the multiplication routine which gives orders to the computer; 
then the sine routine which gives orders to both the multiplier and computer; 
and above them the human operator who can give orders to all. 

So the second important characteristic which differentiates big computers 
from little ones, and one big computer from another, is the number and kinds 
of routines which are built into it. Like Swiss music boxes, such built-in routines 
need only a starting signal to play out their favorite tunes. Like Swiss music 
boxes, they are unprofitable curiosities, unless their special tunes are often 
wanted. 

Suppose now that we have such an automatic computer consisting of an 
arithmetic unit, some storage fast and slow, and some Swiss music box routines. 
What must we do to solve a problem? We must give the necessary orders. This 
implies that we must know how to solve the problem. Analog computers may 
sometimes solve problems which their operators would not know how to solve 
without them, but this is never true of digital devices. The same orders which 
they execute could be obeyed with pencil and paper, albeit with more human 
toil and errors and at a slower pace, and would lead to the same result. True, 
the reduction of time and effort may be so large that it will be practicable to 
solve problems which otherwise would not be undertaken; and the reliability 
of machines is better than that of human computation. In these senses, but in 
no other, the machine can do what human beings “could not” do. 

Then we must set down the orders in such detail and in such form as will fit 
the characteristics of the machine. There are always alternatives in arrange- 
ments of detail; sometimes also in broad methods of solution. One arrangement 
may require more numbers to be stored than the machine is capable of; another 
may require much less. With one alternative we may be able to call upon the 
Swiss music boxes to give a large proportion of the orders; with another it 
might be necessary to write them out in full detail. Ingenuity is always an 
asset in formulating these orders. In dealing with new and difficult problems 
deep mathematical insight is sometimes required. 

The form in which orders are put must also be appropriate in another sense. 
The reaction times of human operators are measured in tenths of a second; 
those of computers in thousandths or millionths. If the machine had to adapt 
itself to the pace of the human operator, it would turn out only a small fraction 
of the work of which it was capable. Therefore the large machines require some 
sort of record—punched cards, punched tape, magnetic tape, etc.—upon which 
the operator may inscribe instructions at leisure and from which they can be 
fed to the machine at a much higher speed. 

When we have done these things, we will find the machine a willing slave 
capable of Herculean labors. But for all its prodigious powers, the slave requires 
an intelligent master. 

The content of Faster, Faster is such as to add depth and detail to these 
ideas. In addition to purely descriptive matter, there are chapters on Arith- 
metic and Logic, Instructions and How They Are Executed, Plain and Fancy 
Arithmetic, Checking, Maintenance. There is also a long chapter called “What 
Is There to Calculate?” which gives a glimpse of why large and flexible calcula- 
tors such as NORC are so important to science and to the military. 

I have spoken with enthusiasm both about the message which the book con- 
veys and about the clarity of presentation. There are some things it does not 
attempt to do. It does not set NORC in historical perspective. This, I think, is 
regrettable. For three centuries at least, the art of mechanical computation 
scarcely moved forward at all; what advances there were related to such mat- 
ters of detail as designing machines for production or adding electrical power. 
So superficial were they that only a short generation ago races were run between 
experts with the abacus and experts with electric calculators, without much 
credit to the machines. Then a surge of new ideas and new devices occurred 
here and abroad, and in less than two decades we have NORC and its fellows. 
If today a man with an abacus were given enough work to keep him busy a full 
eight-hour day, and NORC and he were set to work on it by the same start 
signal, NORC would literally finish before the man got started. His normal 
muscular reaction time is all NORC would require. Computationwise we find 
ourselves suddenly in a new world. The implications for science are consider- 
able; this should be clear from the chapter “What Is There to Compute?” But 
how and why the surge came, the independent and yet mutually stimulating 
contributions here and abroad, the reaction back on the old tabulating machine 
art, and the growing vision of integrated data processing systems almost un- 
fettered by time and space, are outside the plan of the book. Had they been 
included, the spotlight would have been less sharply on IBM, but the book 
would have been an even better document for the intelligent reader interested 
in a broad understanding of the forces at work in our rapidly changing society. 
The story would come appropriately from IBM, not only because of the direct 
contributions of its scientists to electronic computation, but also because its 
president had the vision to support research on computation a decade or more 
before the upsurge began, and subsequently to establish the Watson Scientific 
Computing Laboratory. 

Another subject not touched upon in Faster, Faster is the differentiation of 
computers intended for different specific fields of use. I do not suggest that it 
should. However, many readers of this Journal may be potential users of mod- 
ern computers, and hence interested in this subject, because of its bearing upon 
the selection of a type suitable to their needs. For them, the experience of my 
company as a designer and user of digital systems may be helpful. What follows 
is added for that reason, and not because of any bearing, historical or otherwise, 
on the book under review. 

Interest in digital computers at Bell Telephone Laboratories dates back to 
1937. It began when George R. Stibitz, who was then in our Mathematical 
Research Department, demonstrated how relays, such as are used in modern 
dial central offices, could be used to add numbers. Since this is the basic opera- 
tion upon which all arithmetic computation is based, it followed that relay cir- 
cuits, if properly controlled, could do any sort of mathematical routine. 

Stibitz had a very lively imagination, and in a short time had broadened his 
ideas to include many of the basic features now incorporated in large electronic 
systems, such as control by previously prepared programs, addressed storage, 
and so on. He proposed that such a computer be built, using relays for the 
arithmetic unit, internal storage, and control; and teletype tape for external 
storage and program control. This proposal exhibited great originality for its 
day—the time was years ahead of the earliest electronic computer. However, 
the project was thought to be too ambitious for a first experiment, and what 
was actually undertaken was a much simpler device to serve a known specific 
need. 

In designing filters, equalizers, and other transmission circuits, we had to do 
a great deal of calculation with complex numbers. This was very laborious with 
the standard forms of desk calculators, and we therefore decided to undertake 
the construction of a relay calculator which would add, subtract, multiply, or 
divide such numbers. Such a machine was built, and successfully put in service 
early in 1940. The arithmetic unit was housed in a small closet and connected 
to teletypewriters in three offices where groups of computresses were located. 
A problem keyed up at any one of the three keyboards was solved by the cen- 
tral relay unit, and both the problem and answer were written on the associated 
typewriter. This arrangement was highly successful, and was in almost con- 
tinuous use for over eight years. 

On one occasion in 1940 one of the operating stations was taken to Hanover, 
New Hampshire, where the American Mathematical Society was holding its 
national meeting, and connected to the computer over a commercial telegraph 
line. Problems keyed up at this station were solved on the relay equipment in 
New York, and the answers typed out in Hanover. The system was about 1000 
times slower than NORC. Nevertheless the audience was astonished to see the 
answer begin to appear on the typewriter before the final digits of the problem 
had been keyed up. 

By 1940, World War II was near at hand and the manpower of industrial 
laboratories was being rapidly absorbed by defense projects. So the further 
development of digital computers was put aside. But not for long. One of the 
military projects undertaken by Bell Laboratories was the design of an electri- 
cal director to aim and fire anti-aircraft guns. This was itself a computing de- 
vice of analog type; and it was so superior to previous mechanical types that it 
became the standard equipment for both the American and British armies dur- 
ing the latter years of the war. An ancillary activity was the development of 
suitable means for testing the performance of the directors as they came off 
the production line. For this purpose, gear was developed to feed simulated 
data into them and check the resultant output data against what it ought to be. 
Here a disquieting situation arose. The amount of computing which would be 
required to produce the simulated data and determine the corresponding cor- 
rect gun data was enormous—so enormous in fact that production might have 
to be postponed several months on account of it. At this juncture, Stibitz pro- 
posed that only skeleton data at widely spaced intervals be computed manually; 
and that while this was being done a special relay calculator be built to interpo- 
late data for intermediate points. He believed that such a machine could be 
designed and built, and thereafter could perform the required computations, 
in time to meet the desired production schedule. This is, in fact, what was done; 
otherwise the critically needed M9 director would not have been in use until 
several months later than it was. This was in September, 1943. 
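
The idea of computing only skeleton data by hand and leaving the intermediate points to a machine can be illustrated briefly. The account above does not say what interpolation formula the relay calculator used, so the sketch below assumes plain linear interpolation; the function name `interpolate` and the sample figures are hypothetical and stand for no actual gun data.

```python
def interpolate(skeleton, x):
    """Linearly interpolate between hand-computed (x, y) skeleton points."""
    points = sorted(skeleton)
    if len(points) < 2:
        raise ValueError("at least two skeleton points are required")
    if not points[0][0] <= x <= points[-1][0]:
        raise ValueError("x lies outside the skeleton data")
    # Find the pair of neighboring skeleton points that brackets x.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical skeleton data at widely spaced intervals.
skeleton = [(0.0, 10.0), (10.0, 30.0), (20.0, 20.0)]
intermediate = [interpolate(skeleton, x) for x in (2.5, 5.0, 15.0)]
print(intermediate)
```

A machine built for the purpose would presumably use a higher-order formula where the data curve sharply; the division of labor between manual skeleton points and mechanical filling-in is the same.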

This second relay calculator was much larger than the first, and embodied 
some of the features of automatic program control which Stibitz had visualized 
earlier. It had not been intended as a permanent installation, but by the time 
it had served its original purpose other uses appeared, and so with suitable 
modifications from time to time it remained in use at Bell Telephone Labora- 
tories until the end of the war. It was then refurbished and turned over to the 
Naval Research Laboratories, where it is still in use. 

The M9 director development was being closely watched by officers of sev- 
eral branches of the armed forces. They thus came in contact with the relay 
interpolator, and were instrumental in originating requests for Bell Labora- 
tories to construct several relay calculators for other government uses. In re- 
sponse to these requests, four additional computers were built and delivered 
to the National Defense Research Committee in June, 1944, the Naval Re- 
search Laboratories in March, 1945, the National Advisory Committee on 
Aeronautics in December, 1946, and the Ballistic Research Laboratory in 
August, 1947. These were carefully designed for long life and ease of mainte- 
nance, and the last two especially embodied new ideas as well as the results of 
experience with the older machines. So rapid has been the pace of computer 
development that these computers are already obsolete, principally because 
electronic computers are so much faster. They are still in service, however, and 
are well suited to certain types of computation. In two respects they are still 
unique. One is their fantastic record of accuracy. They are so designed that it is 
substantially impossible for them to commit an error which they do not them- 
selves detect; as a matter of fact, at the end of 1953 when I last checked on the 
matter, there was only one report of such an undetected error having occurred, 
this case being somewhat obscure. The other is their ability to operate for long 
periods—overnight, or over holidays and weekends—without attendance. 
Both these characteristics come about from incorporating in these calculators 
the same principles of design which are used in dial central office equipment to 
assure accurate and trouble-free operation. 

Since the close of the war our activities in the computer field, aside from 
military devices, have been of three kinds. First, we built a relay calculator, in 
general similar to those built for the government, but tailored specifically for 
use in the design of electrical networks. Unlike the problems of a scientific re- 
search laboratory, these tend to run in definite patterns which frequently recur. 
For example, they deal principally with complex numbers; matrices and con- 
tinued fractions are very common; determination of the roots of high-powered 
polynomials is a problem of every-day occurrence; exponentials, logarithms 
and binary fractions—all with complex numbers—are everywhere. So it is not 
surprising that in the design of this machine provision was made for a very 
high degree of internal control. There were, in fact, over 200 routines perma- 
nently wired into the machine, a few of which are shown in Table 1. By thus 
designing the machine to take advantage of the repetitive character of the 
work, the labor of coding problems for the machine was greatly reduced. 

This machine was normally attended only eight hours per working day. 
Nevertheless, it achieved the unique record of doing more than 24 hours useful 
work per working day throughout the year. Unattended work on holidays and 
weekends more than compensated for all stoppages for maintenance or other 
reasons. It is now obsolete, having been replaced in 1956 by electronic gear. 


TABLE 1 
SOME INTERNAL ROUTINES OF THE MARK VI COMPUTER 

1) $(a + jb)(c + jd) = e + jf$ 
2) $\sqrt{(a + jb)(c + jd)} = e + jf$ 
3) $\dfrac{(a + jb)(c + jd)}{[\ldots]} = [\ldots]$ (denominator and result illegible in the source) 
4) $X = \dfrac{k\,[\ldots]}{w\,[1 - (w/w_a)^2]\,[1 - (w/w_{\ldots})^2]}$ (numerator factors and one subscript illegible in the source) 
5) Multiplication of matrices. 
6) Solution of any number of simultaneous linear equations. 
7) $f(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$ 
   Evaluation of $f(t)$ and $df(t)/dt$ for any number of values of $t$. 
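
Two of the routines in Table 1 are simple enough to sketch in Python. The function names are hypothetical, and the use of Horner's rule for routine 7 is an assumption made for the example; nothing here is meant to describe how the relay machine itself carried out the work.

```python
def complex_product(a, b, c, d):
    """Routine 1: (a + jb)(c + jd) = e + jf; returns the pair (e, f)."""
    return a * c - b * d, a * d + b * c


def poly_and_derivative(coeffs, t):
    """Routine 7: evaluate f(t) = a0 + a1*t + ... + an*t**n and df/dt at t.

    coeffs lists a0, a1, ..., an in ascending order of power.
    Horner's rule is used, carrying the derivative along with the value.
    """
    value = 0.0
    slope = 0.0
    for a in reversed(coeffs):
        slope = slope * t + value   # derivative update uses the previous value
        value = value * t + a
    return value, slope


print(complex_product(1, 2, 3, 4))
# f(t) = 1 + 2t + 3t^2, evaluated for several values of t
for t in (0.0, 1.0, 2.0):
    print(t, poly_and_derivative([1.0, 2.0, 3.0], t))
```

Once such routines exist, coding a problem reduces to naming the routine and supplying the numbers, which is the saving in labor that permanently wired routines were meant to give.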

















Second, we have developed a system for recording and processing the data 
required to bill telephone customers for the calls they make. I want to describe 
this in some detail, since I shall have occasion to refer to it again. 

When a telephone call is to be entered as a separate item on the customer’s 
bill—this is the only case we shall be concerned with—it is necessary to know 
who made the call (to know to whom to charge it), its destination (to know at 
what rate to charge), and how long it lasted. It is also important to know the 
date and time of day, especially if the customer questions the charge. Where 
such calls are handled by an operator, as was universally the case a few years 
back, all this information is recorded on a slip of paper, called a ticket. These 
tickets are then sent to a central point, where the charge is calculated and 
entered on the ticket. Each customer’s tickets are then sorted out manually 
and the items typed on his bill. 

This procedure has always been unsatisfactory. It is expensive; human errors 
are bound to occur more often than we like; and above all the work is monoto- 
nous and not attractive to stable and reliable employees. The rapidly growing 
use of the telephone over longer distances in recent years has greatly increased 
the quantity of work, and the labor cost has risen. There would be an im- 
portant need for mechanization of the process even if calls continued to be 
handled by operators. Where calls are dialed direct to their destination by the 
customer, a mechanical process is clearly necessary; without it our objective of 
eventually enabling customers to dial direct to any telephone in the country 
would be impossible. 

With this as a background, the Automatic Message Accounting (AMA) sys- 
tem to which I refer consists of two parts. One, which is located in the central 
office, records the necessary data in the form of perforations in paper tape. This 
is done entirely automatically, even the identity of the calling telephone being 
determined by the mechanism without human aid. The other, which is located 
at a central point, processes these tapes, sorts out the items chargeable to each 
customer’s account, and computes the charge. At an appropriate stage of the 
process punched cards, suitable for use in standard commercial accounting 
machines, are automatically prepared, and thereafter the process of making 
out the customer’s statement follows the usual punched card routine. 
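
The work of the central processing stage, sorting the recorded items by account and computing each charge, can be sketched as follows. This is an illustration only: the record layout, the two-zone rate table, and the names `CallRecord` and `bill_accounts` are hypothetical and are not taken from the AMA equipment.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical rate table: cents per minute by destination zone.
RATE_CENTS_PER_MINUTE = {"local": 5, "toll": 20}


@dataclass(frozen=True)
class CallRecord:
    caller: str    # who made the call (the account to charge)
    zone: str      # destination, reduced here to a rate zone
    started: str   # date and time of day, kept in case the charge is questioned
    minutes: int   # how long the call lasted


def bill_accounts(records):
    """Sort the items chargeable to each account and total the charges."""
    by_caller = defaultdict(list)
    for rec in records:
        charge = RATE_CENTS_PER_MINUTE[rec.zone] * rec.minutes
        by_caller[rec.caller].append((rec, charge))
    return {
        caller: {"items": items, "total_cents": sum(ch for _, ch in items)}
        for caller, items in sorted(by_caller.items())
    }


tape = [
    CallRecord("555-0101", "toll", "1956-01-03 09:15", 3),
    CallRecord("555-0199", "toll", "1956-01-03 09:20", 1),
    CallRecord("555-0101", "local", "1956-01-04 14:02", 10),
]
for caller, statement in bill_accounts(tape).items():
    print(caller, statement["total_cents"])
```

The output of such a step corresponds to the point at which punched cards are prepared; everything after it follows ordinary accounting routine.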

At the beginning of 1956, such AMA equipment was collecting and proc- 
essing the data for approximately 5,500,000 subscribers’ accounts. If the work 
were done manually, it would be necessary to write, sort, calculate and tran- 
scribe nearly 20,000,000 tickets per month. 

Our third major postwar activity in the computer field is one which we con- 
fidently expect will eventually open up an entirely new era. I refer to the in- 
vention of the transistor. This device will do almost everything now done by 
vacuum tubes in computers. It will consume much less power, and will be more 
reliable and more compact. Considerable progress has been made already in 
devising transistor circuits for digital purposes and in laboratory studies of 
computers based on them. It can be said with assurance that the era of the 
transistor computer is just around the corner. When it has fully matured, we 
shall have computers of greater capacity and speed than now, and even the 
most magnificent of our present systems such as NORC will no doubt become 
obsolete, even as the relay computers already are. The great forward surge of 
the computer art has not yet run its course. 

Some valid lessons can now be drawn from this experience and from the 
philosophic remarks with which I began. 

The first is, not to be impressed by bigness, nor mystified by such cant 
phrases as “electronic brains.” Instead, think of these devices, big or small, as 
agile but rather stupid slaves. Or avoid the human parallel entirely and think 
of all computers, from the earliest abacus to the latest Maniac, as labor-saving 
devices. 

When so judged, it is clear that there is not, and cannot be, any one machine 
which is best for all uses. This is the second lesson. Even the abacus is still 
preferred for adding laundry slips by men who long since adopted the electric 
mangle and cash register. 

The third lesson is a corollary of the second. Before a choice of automatic 
computer can be made, or even a wise decision reached that one is required at 
all, it is necessary to know what the requirements are. Are the problems strictly 
repetitive, as in message accounting; do they tend to follow set patterns, as our 
network problems do; or do they exhibit the great variety characteristic of 
general scientific studies? 

For problems of the first type flexibility is unimportant and a maximum of 
internal control is desirable. The AMA system, for example, which was de- 
signed to handle such a situation, has almost no human control whatever, even 
the problem data being introduced mechanically. It is nearly all memory, and 
the controls are rigid and built in. 

No professional personnel whatever are needed to operate such equipment. 
It is maintained by the same kind of skilled craftsmen who care for our dial 
offices, and operated by the kind of clerical personnel that would commonly be 
found in accounting offices. 

Our network computations afford an example of the intermediate situation. 
I presume it has many counterparts in other industries. Here the problems 
possess individuality and thus are not strictly routine, but certain formulas 
and algebraic processes tend to recur with great frequency, and constitute a 
large proportion of the total computing effort. Here we found a good solution 
by combining a large number of internal routines, which greatly reduced the 
labor of preparing instructions, with rather elaborate external controls, which 
preserved flexibility. 

Before this machine could be built, a very scholarly study was necessary to 
bring out clearly the mathematical patterns characteristic of the field of net- 
work design, to devise methods of computation compatible with the general 
plan of the proposed machine, to determine what internal routines could be 
used to greatest advantage, how much memory would be required, and so on. 
But once methods of handling the problems had been worked out, the equip- 
ment could readily be operated by technical assistants, and professional super- 
vision was reduced to a minimum. Also, the capacity for unattended operation 
proved to be a great asset. 

I believe that other industries with set patterns of computational require- 
ments can confidently expect to find some equally satisfactory installation; and 
can expect to be able to operate it with assistance personnel as we do. But they 
must not expect to reach this stage without a planning period during which a 
careful study of the nature of their work, sound experience with computers and 
their possibilities, and considerable mathematical ingenuity will all be required. 
In some cases the industry may not have among its permanent employees a 
person with the required level of mathematical training and skill to make such 
studies. If not, the makers of commercial equipment can render great assist- 
ance, or the temporary services of a professional mathematician with suitable 
experience may be secured. However managed, this stage of careful and com- 
petent planning will be necessary if the best computer for the special situation 
is to be obtained. 

Where problems exhibit greater variety, flexibility is obviously a necessity. 
Swiss music box routines are less useful than before, since recurrent melodies 
are less common. Furthermore, every new problem presents a challenge to 
make optimum use of the available storage and control facilities in solving it, 
no matter what machine may be provided. In this situation heavy emphasis 
must be placed on choosing the men who use the machine. For this type of work 
the professional mathematician will be an asset as a permanent member of staff. 

Our experience at Bell Telephone Laboratories also teaches a fourth lesson: 
that complicated work can be quite competently done without the most 
elaborate equipment. We are an industrial laboratory devoted entirely to re- 
search, development, and design in the fields of communications and military 
electronics. We do no manufacturing and no selling. More than 3000 of our 
10,000 employees are professional engineers, physicists, chemists, and other 
scientists. Our activities range from basic research in the physical sciences, 
such as might be expected in the better universities, to the final detailed design 
of equipment for manufacture. They generate a wide variety of mathematical 
problems, some big and some small, some conventional and some that raise new 
questions of mathematical theory. It is in such situations that large general 
purpose computers such as NORC are most useful. Yet, until 1955, most of the 
problems which our research scientists evolved were solved on a CPC, which 
is a relatively modest machine. For those that were too involved to be handled 
this way we rented time on more elaborate equipment. 

Far more important than the size of the machine is the caliber of the men who 
control it. Not only may an hour’s study by a competent man save hundreds of 
machine hours; what is more important, he may avoid entirely the solving of 
problems that never should have been solved, because they were not the right 
problems to begin with. Probably far more is being lost every day in computing 
because of inadequate humans than because of inadequate machines. With the 
right kind of men in charge of them, the computing machines may not work so 
many hours a day, but their output will be of greater value. 

Finally, in this matter of installing automatic computing equipment, it is 
not necessary to start off in a big way. It is both practicable and prudent to feel 
one’s way, to size up the general requirements of one’s situation, to seek sound 
advice wherever it can be found, and to get equipment that will meet these 
requirements in at least a better way than they are now being met. Then— 
especially if the problems are varied in character—man it well; this is the place 
to spend money at the start. For it takes an intelligent master to get the best 
out of these willing, but stupid, slaves. 





THE EFFECT OF RESPONDENT IGNORANCE 
ON SURVEY RESULTS* 


ROBERT FERBER 
University of Illinois 

On numerous economic and social issues, opinions of the informed 
segment of a population may differ substantially from those of the un- 
informed, to judge by results obtained from an experimental “public 
opinion survey.” Moreover the mere fact that a person is not informed 
about an issue does not deter him, as a rule, from offering an opinion 
when asked. Highly significant with respect to the validity of public 
opinion polls may be the marked tendency for those not informed about 
an issue (at times more than half of those interviewed) to lean toward 
a neutral position and for the misinformed to differ substantially in 
their opinions from the minority who were informed. 


ALMOST all surveys on political and economic issues, as well as on brand 
preferences and other marketing questions, contain a large component of 
ignorance, in that respondents’ opinions are dictated by prejudice or propa- 
ganda. The marketing and public opinion literature is full of evidence of this. 
The extent of misinformation is clearly a matter of considerable interest, par- 
ticularly the extent to which opinions and attitudes obtained on surveys repre- 
sent informed viewpoints, and the extent to which the over-all results obtained 
are affected by differences between the opinions of the informed and of the 
uninformed.¹ 

The objectives of the present study are threefold, namely: 

1. What is the extent of ignorance on issues of current interest? 

2. How do the opinions of those informed on a particular subject differ from 
those of the uninformed? 

3. How do these differences affect opinion survey results? 

Given answers to the above questions, the problem then becomes one of how 
to make use of them to improve the validity of future attitude and opinion 
surveys. This matter is considered in the final section of this article. 

We begin with a description of the present study and how it was conducted.² 





* The author would like to thank the Graduate Research Board of the University of Illinois for a grant which 
made this study possible as well as the Bureau of Economic and Business Research of the University for providing 
the necessary facilities. Howard McBride and Robert Zaruba rendered valuable assistance in analyzing the data. 

1 It would seem preferable to say “affected” rather than “biased,” for to do otherwise would imply a value 
judgment regarding the meaning of the true state of opinion on the subject. Thus, in certain circumstances it may 
be desired to ascertain prevailing state of opinion irrespective of one’s knowledge about the subject. Usually more 
significant for policy formation or marketing strategy, however, is the state of informed opinion coupled with in- 
formation on the nature and extent of uninformed opinion. 

2 Other direct attempts relating to one or more of the above three objectives of which the author is aware are 
Gleason, J. G., “Attitudes vs. information on the Taft-Hartley law,” Personnel Journal, 2 (1949), 293-99; Kubany, 
A. J., “A validation study of the error-choice technique using attitudes on national health insurance,” Educational 
and Psychological Measurement, 13 (1953), 157-63. In both of these cases, knowledge was not found to affect atti- 
tude on the subject. Additional references in this area, some oblique, are Fisher, B. R. and Withey, S. B., Big 
Business As the People See It, University of Michigan Survey Research Center, 1951; Bruner, J. S., Mandate of the 
People, New York: Duell, Sloan and Pearce, 1944; Bruner, J. S., “Public opinion and America’s foreign policy,” 
American Sociological Review, 9 (1944), 50-6; Newcomb, T. M., “The influence of attitude climate upon some deter- 
minants of information,” Journal of Abnormal and Social Psychology, 41 (1946), 291-302; Scott, W. A., and Withey, 


THE STUDY 


A sample of 600 individuals was selected by means of random numbers from 
the 1954 Champaign-Urbana city directory in April 1955. These individuals 
were approached during May to July by experienced interviewers on the staff 
of the Bureau of Economic and Business Research of the University of Illinois 
in connection with a “survey of public opinion.” Following two opening ques- 
tions on the respondent’s opinion of business trends, his attitude was solicited 
on four questions of current interest, namely, on minimum hourly wages for 
labor, the guaranteed annual wage, the effect of the rate of return on govern- 
ment bonds in inducing people to buy them, and the principle behind the fair 
trade laws. In each case, the individual was asked for his attitude on the sub- 
ject, his reasons for it, and a probing question relating to his knowledge of this 
subject. The individual was not compelled to express his views either way, but 
once he did express a definite view the interviewers were instructed to obtain 
some indications, by repeated questioning, of his knowledge of the subject. 
Thus, the questions on the guaranteed annual wage were, as follows: 


a. What is your attitude toward allowing labor to have a guaranteed annual wage: 

   For ____  Against ____  Neutral ____  Don’t know ____  No answer ____ 

b. (If an attitude is expressed) Why? 

c. As you interpret it, what do the unions mean by a guaranteed annual wage? 

The organization of the questions on the other three issues was similar. In the 
case of the Federal minimum wage requirement, the respondent was asked if he 
thought it should be raised, lowered or kept the same; then why he thought so; 
and then the figure for the existing minimum wage.³ On bond yields, the ques- 
tion related to whether raising the yield on government bonds would induce 
more people to buy them; why so; and then how the rate of return on govern- 
ment bonds compared with that of savings accounts in savings banks. On fair 
trade, the respondent was asked whether the present fair trade laws (there are 
such laws in Illinois) should be strengthened, abolished or left as is; why; and 
what he understood the principle of fair trade laws to be. 

In addition, information was solicited on magazines and periodicals read, 
newspapers read as well as the frequency of readership of particular features in 
the newspaper (local news, national and international developments, sports, 
etc., even comics), age, occupation, sex, status in household, and family income 
bracket. 

S. B., The United States and the United Nations—Attitudes, New York: Manhattan Press, 1956. Also (all in the Pub- 
lic Opinion Quarterly) Showel, M., “Political consciousness and attitudes in the State of Washington,” 17 (1953-54), 
394-400; Nafziger, R. V., Engstrom, W. C., and Maclean, M. S., Jr., “The mass media and an informed public,” 15 
(1951-52), 105-14; Ziff, R., “Voters’ bias in interpreting election poll predictions,” 12 (1948), 326-28; Katz, D., 
“Three criteria: knowledge, conviction and significance,” 4 (1940), 277-84. The latter is a general theoretical discus- 
sion of the subject. 
3 A slight complication was introduced here by the fact that the Federal minimum wage was raised from 75
cents to $1.00 before all the interviews had been completed. During the week preceding the President's signature of
the bill, the press gave considerable publicity to the impending signing of this bill, and during the week immediately
afterwards either answer was accepted as correct. Actually, few interviews were made during or after this
period, 19 out of a total of 402.








578 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1956 


A total of 647 names was selected from the survey including pretest names.⁴
Of this number, 225 could not be contacted, 15 refused to cooperate, and completed
interviews were obtained with 407 people. The large number of people
that could not be located is attributable in part to the high turnover that occurs
after the end of the fall semester in this university community and in part to
the fact that the names in the directory had been compiled in early 1954.⁵
Comparison of the age, occupation and income distributions obtained from
this survey with those of other surveys made at about the same time reveals no
significant differences except for over-representation of women.


THE EXTENT OF IGNORANCE 


Before attempting to gauge the extent of ignorance, one has to define what 
is meant by ignorance. From the standpoint of this study, there are three 
principal gradations of ignorance. These are: the informed, the “don’t knows,” 
and the misinformed. Informed respondents on each particular issue are de- 
fined as those who answered correctly the factual question in each case— 
knowledge of the present minimum wage, understanding of at least the idea 
back of the guaranteed annual wage, knowledge of the yield on government 
bonds relative to those of savings banks, and some comprehension of the principle
behind the fair trade laws or of resale price maintenance. It is perhaps
needless to note that the characteristics and number of informed respondents
differ from one issue to another; the extent of this phenomenon is explored in
a later section.

“Don’t knows” are those who admitted not having any knowledge about the 
particular subject. Where an answer was given, it was generally clear whether 
the respondent was informed or misinformed. The criterion used to classify 
respondents in the two instances involving some judgment—the guaranteed 
annual wage and fair trade—was a rather liberal one. If the respondent showed 
any understanding at all of the concepts without at the same time making 
factually incorrect statements he was classified as “informed.” If only part of 
his statement was factually in error, he was classified as “partly misinformed.” 
If the entire statement was incorrect, such as an assertion that the guaranteed 
annual wage was designed to equalize all workers’ pay, or if he beat around the 
bush without answering the question, he was placed in the “misinformed”
category.⁶

Question may be raised on the reliability of the particular factual questions 
employed as indicators of one’s knowledge of an issue. In the case of the two 
questions on the meaning of the concepts involved—the guaranteed annual 
wage and fair trade—I believe that they do provide a fair indication of one’s 





4 The pretest interviews were combined with the regular interviews because the original field questionnaire
worked out very well with the exception of one question. That question, one on farm parity, was replaced by the
questions on fair trade laws, and it is for this reason that the number of answers on the fair trade questions is below
those received for the others.

5 By hindsight, it is clear that the use of addresses rather than of names would have reduced sharply the number
of non-contacts and yielded a higher rate of response.

6 In the latter case, the respondent's true state of knowledge was invariably apparent. The interviewers were
instructed to probe for an answer only as long as the respondent did not seem to answer the question and not to
antagonize the respondent by pressing for an answer.








EFFECT OF RESPONDENT IGNORANCE 579 


knowledge of the subject for the purposes of this study. Certainly, if a person 
is not aware of the basis for advocating a particular action or of what it in- 
volves, he can hardly be in much of a position to form a judgment regarding 
its desirability. On the other hand, where the question deals with numerical
information, as in the case of the minimum wage and bank and bond yields, the
connection with general knowledge of the subject is not necessarily conclusive. 
It is likely to be more so for the question on yields, for knowledge of at least the 
relative levels of the two types of investment would seem to be of some importance
in forming an informed opinion regarding the effect on sales of higher
yields on government bonds. In the case of the minimum wage, however, it 
may well be argued that knowledge of the exact figure is not a prerequisite for 
having an informed opinion on whether the minimum wage should be changed 
or not. However, an approximate idea of this figure would be required for such 
a judgment, and for this reason any figure within 10 cents of the true figure was 
accepted as correct. 
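
The scoring rule for the minimum wage question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the use of whole cents, and the treatment of "within 10 cents" as inclusive are choices made here, not details given in the paper. The second accepted value reflects the footnote stating that either figure was accepted around the date of the change.

```python
def classify_wage_answer(answer_cents, accepted_cents=(75,), tolerance_cents=10):
    """Classify a stated hourly minimum wage, given in cents.

    answer_cents    -- the respondent's figure, or None if no figure was offered
    accepted_cents  -- figure(s) treated as the true minimum wage (hypothetical parameter)
    tolerance_cents -- allowed deviation; assumed inclusive for this sketch
    """
    if answer_cents is None:
        return "don't know"
    for true_value in accepted_cents:
        if abs(answer_cents - true_value) <= tolerance_cents:
            return "informed"
    return "misinformed"


# Before the change only 75 cents counts; during the changeover either figure counts.
before_change = classify_wage_answer(100)
during_change = classify_wage_answer(100, accepted_cents=(75, 100))
```

Working in whole cents avoids floating-point comparisons at the edge of the tolerance band.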

Now we are ready to examine the findings of the study. First we ask, how 
prevalent was knowledge, or ignorance, of these issues among the sample 
members? Chart 1 provides some rather startling light on this question. The 
percentages in each bar of this chart have as their base the total number of 
people in the sample possessing that particular population characteristic. For 
purposes of this analysis, the “no answers” have been combined with the 
“don’t knows,” and the two “partially misinformed” respondents on the fair 
trade question were combined with the “misinformed” group. 

The summary bars at the bottom of the chart show that less than half of 
those questioned were informed on any of the four issues, and in one case—fair 
trade—less than a quarter of the respondents understood the concept involved. 
The uninformed were predominantly people who admitted the fact, with only 
a minority giving answers which turned out to be wrong. The “don’t knows” 
varied between 40 and 70 per cent of the total sample, while not more than 
15 per cent appeared to be misinformed on any one issue. 

The breakdowns by population groups indicate considerable variation in 
degree of knowledge. The salient points may be summarized briefly as follows: 

The proportion of people who were informed tended to 

decline for those aged 50 and over, 

rise with education, not surprisingly, 

rise among the more highly-educated occupational groups, with the notable 
exception that laborers appeared to be better informed than clerical- 
sales people, 

rise with income level, and 

be substantially higher among men than among women. 

The proportion of people who acknowledged not possessing the information
tended to

rise with age, 

decline with education, 

decline among the more-educated occupational groups, 

decline as income rose, though the percentage appeared to turn upward
again in the highest income group, those earning $10,400 per year or
more, and

be substantially lower among men than among women.

[Chart 1. Distribution of degree of ignorance by issue, by respondent characteristics. One panel per issue; bars in per cent for the total sample and for each age, education, occupation, income, and sex group. Legend: misinformed, don’t know.]

The proportion of people who were misinformed tended to 

fall with age,⁷

rise with education on two issues (college graduates were generally the
most misinformed of all),⁷

be higher among managers and proprietors, with the notable exception of
the guaranteed annual wage,

decline with income, with the noticeable exception of the Federal minimum 
wage for which the extent of misinformation rose with income, 

be higher among men than among women. 

An interesting feature of Chart 1 is the relation of information level to self- 
interest, particularly evident for the occupational and income breakdowns. 
Thus, laborers and the middle-income groups are most frequently informed on 
the minimum wage; professionals, managerial-proprietors and the highest in- 
come group on bond yields; and managerial-proprietors and the highest income 
group on fair trade. The guaranteed annual wage provides an exception in that 
laborers are not the best informed occupational group, though this may be due 
to the fact that the area sampled contained no firms having such wage plans. 

Most surprising, perhaps, is the general inverse relation between the char- 
acteristics of the don’t knows and those of the misinformed. Evidently those 
with negligible education were more likely either not to have the information 
or to admit not possessing it, whereas those with some education were more 
likely to venture an opinion which then turned out to be in error—which goes 
to bear out Pope’s warning that “a little learning is a dangerous thing!”⁸

Of particular analytical interest is the concentration of ignorance. Do the 
informed and the ignorant in general appear to be the same people, or is there 
substantial variation from one subject to another? This is the subject of Chart 
2, which presents the distribution of frequency of ignorance of individuals by 
the same characteristics used in Chart 1. The percentages in this chart are 
computed in the same manner as those in Chart 1 with the total number of 
respondents having the particular characteristic as the base for each bar. 

Chart 2 exhibits a definite concentration of information (or of ignorance) in 
particular population groups, the pattern corresponding quite closely to the 
characteristics of the informed. This is not too surprising in view of the fact, 
from Chart 1, that much the same type of people appeared to be informed on 
each of the four issues. Chart 2 serves to confirm this point when the degree of 
knowledge is combined for all four issues. 

At the same time, a large proportion of the sample was evidently informed 
on one or two issues and not on the others, nearly half of the respondents falling 
in this category. Thus, there does appear to be substantial variation in knowl- 
edge from one issue to another, but this variation is not wholly erratic, ex- 
hibiting strong patterns by socio-economic groupings. 


7 Though in each case the tendency was statistically significant at the .05 level only for the question on minimum wages.

8 The same is true when the base for comparison is restricted to those not informed about the subject. The more
educated were more likely than others to display unjustified self.





[Chart 2. Distribution of degree of knowledge by frequency among respondents. Bars in per cent by education, occupation, income, and sex.]


IGNORANCE AND OPINION 
Informed versus All Respondents 


What was the relationship between ignorance of the respondents and their 
opinions on this “public opinion” survey? To judge from Chart 3, substantial 
differences existed between the opinions of the informed and those of all re- 
spondents on three of the four issues. On the whole, the informed group was 
much more heavily in favor of raising the minimum wage, in opposition to the
guaranteed annual wage, and in favor of abolishing the fair trade laws. How- 
ever, on the issue of raising government bond yields, there was little difference 
of opinion, in the aggregate, between the informed and the rest of the sample. 
Chi-square tests bear out the above observations. Only on the issue of raising 
bond yields could the differences in the distributions of opinion between the 
informed and the rest of the sample have been caused by sampling variations 
at either the 5 per cent or the 1 per cent significance levels. 
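
A chi-square comparison of this kind can be sketched as follows. The paper does not print the underlying counts, so the figures in `hypothetical_counts` are invented purely for illustration, and the function name is likewise an assumption. The critical values are the standard tabulated ones for 2 degrees of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table -- list of rows of observed counts, e.g. one row per respondent group
             and one column per opinion.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts (NOT from the paper): opinions on the minimum wage,
# columns = raise / lower / no change.
hypothetical_counts = [
    [120, 4, 36],   # informed respondents
    [110, 13, 80],  # rest of the sample
]
statistic, df = chi_square_statistic(hypothetical_counts)

# Standard critical values of chi-square with 2 degrees of freedom.
CRITICAL_5_PER_CENT = 5.991
CRITICAL_1_PER_CENT = 9.210
significant_at_5 = statistic > CRITICAL_5_PER_CENT
significant_at_1 = statistic > CRITICAL_1_PER_CENT
```

The informed are compared with the rest of the sample rather than with all respondents, so that the two rows are independent groups.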





[Chart 3. Opinions on four issues of all respondents and of informed respondents. Panels: changing minimum wage; guaranteed annual wage; raising government bond yields; attitude toward fair trade (strengthen, leave as is, abolish, don’t know). Axis: per cent.]


Similar results are obtained if one adopts the practice of some opinion polls
of excluding “don’t knows” and “no answers” and basing the estimates of the
distribution of opinions only on those registering an opinion.


Nature of Respondent Ignorance 


Those admitting not knowing much about an issue were less likely to venture 
an opinion than either the informed or the misinformed, as is shown in Table 1 
(although even this group appeared likely to state an opinion about half the 
time). The misinformed seemed about as likely to venture an opinion as those 
who possessed more accurate information, a phenomenon subject to alternative 
interpretations. One is that the misinformed were in the main people who 
sincerely believed they possessed correct information. Another is that these 
people felt compelled to offer an opinion to save face in front of the inter- 
viewers. Either can be supported by these data. 

A rather striking difference in the type of opinions given on an issue between 
the don’t knows and the misinformed or the informed groups is a strong leaning 
of the former group toward a neutral position. This was particularly true for the 
group admitting not having knowledge about the subject. On each of the three
issues where a neutral opinion was possible, a significantly higher proportion of
“don’t knows” advocated this course than did the informed group. The same 
phenomenon was not so much in evidence among the misinformed group. 
Perhaps believing that they had correct information, these people did not 
hesitate to take a firm stand on an issue one way or the other. It is most inter- 
esting to note, however, that the nature of the stand differed substantially 
from that taken by the informed group. A significantly higher proportion of the 
misinformed (combining both of the subcategories) than of the informed were 
in favor of the guaranteed annual wage and advocated strengthening the fair
trade laws. Only on the question relating to government bond yields was there 
any over-all correspondence of opinion irrespective of knowledge of the sub- 
ject: this was also the only question with but two possible answers. 


TABLE 1 
PROPORTION OF RESPONDENTS STATING AN OPINION BY 
ISSUE AND NATURE OF IGNORANCE 


Issue Don’t know Misinformed Informed 


Minimum wage 62.1% 
Guaranteed annual wage 47.0 
Govt. bond yields 82.7 
Fair trade laws 14.1 


Note: Each figure in the table represents the proportion of those respondents registering a particular degree 
of ignorance on an issue who ventured an opinion on it. Thus, of those who, as it later turned out, did not have an 
approximate idea of the then-current minimum wage, 62.1 per cent nevertheless stated an opinion. 


EFFECT ON SURVEY RESULTS 


We have seen so far that the uninformed (misinformed and don’t knows) 
constituted a major portion of the total sample on each of the four issues studied 
and that substantial differences existed between the distribution of opinions 
of these two groups on three of the four issues. This raises the basic question of 
what effect such differences might have on the results of a survey in this area. 

Such an effect is not readily ascertained because it must depend on how we 
define the “true” state of opinion and on the objective of the survey. Thus, if 
the objective of the survey is specified as determining people’s attitudes on 
certain issues irrespective of degree of knowledge, differences in opinion associ- 
ated with the latter are irrelevant. Although there may be situations in which 
such information would be desirable, it is more likely that degree of knowledge 
will be considered highly pertinent to the state of opinion, and we shall proceed 
on the assumption that it is the opinion of an informed population which is 
wanted. However, once a part of the population turns out to be uninformed, the 
further question arises of what is meant by the opinion of an informed popula- 
tion. In other words, given that the sample adequately represents the popula- 
tion and that the information is correctly obtained, what is the best estimate of 
the “true” state of opinion if all people were informed about the particular 
issue? Alternately, one might ask to what extent are survey results affected by 
arbitrary answers from “don’t knows” who really have no opinion but feel 
compelled to give an opinion? Answers to these questions would be required 
not only for purposes of information and, presumably, for policy formation but
also for dealing with the problem raised in this section: estimating the effect
of lack of knowledge on survey results. 

The resolution of this problem clearly depends on the nature of the assump- 
tions made regarding the opinions the uninformed would have if they happened 
to become informed about the particular issue. Of the great number of possible 
such assumptions, the following three have been selected for use in this study: 

1. The uninformed would not change their opinions in any event, presumably
on the theory that information on the issue would only aid them to
rationalize a preconceived position.

2. Once informed, the uninformed would assume the opinions of those possessing
the same attributes of those characteristics which appear to be
most closely associated with attitudes on that issue. This would appear
to many to be a particularly reasonable assumption.⁹

3. The “don’t knows” would assume the opinions of those possessing the most
closely related similar characteristics, as above. However, the misinformed
would not change their opinions in any event, presumably because,
informed or not, they have already made up their minds and
further information would only be used to rationalize their preconceived
position. (In fact, it might be argued that in many instances misinformation
is not unrelated to the position taken.) The “don’t knows,” on the
other hand, are presumed to be more open-minded and hence more likely
to modify their opinion.

There are many other assumptions that could be made about how the 
opinions of the uninformed might change, some more reasonable than those 
listed above. Yet, because those listed represent in a sense extremes of opinion 
change, it is of considerable interest to examine the effect on the survey results
of the validity of each of these assumptions in turn.

The effect on the survey results of the first assumption has already been pre- 
sented in Chart 3 and is carried over to Table 2. The effect of both the second 
and third assumptions is portrayed essentially by the last column in the table. 
The inferral of informed attitudes to those who are not informed produces, of 
course, a distribution for the entire population which would be exactly like the 
distribution given now for the informed in the last column. This is very closely 
true also if this inferral is made to the uninformed only, leaving the misinformed 
with their present opinion. Furthermore, this distribution is not disturbed 
greatly if adjustments are made either by occupation or by income as controls 
on the uninformed and on the misinformed. These were the characteristics most 
closely associated with attitudes on each of the four issues. 
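
One way to make the three assumptions concrete is the sketch below. It is a hedged illustration, not the paper's computation: the function name, the tuple layout, and the tiny `sample` are all hypothetical. Each reassigned respondent is spread fractionally over the opinions held by the informed members of the same control group (occupation or income, say), and a respondent whose group has no informed members keeps the stated opinion, which is a further assumption made here.

```python
from collections import Counter, defaultdict


def opinion_distribution(respondents, reassign=()):
    """Percentage distribution of opinions under a chosen assumption.

    respondents -- list of (knowledge, group, opinion) tuples
    reassign    -- knowledge categories whose members are given the opinion
                   pattern of the informed respondents in their own group
    """
    # Opinion counts of the informed, separately for each control group.
    informed = defaultdict(Counter)
    for knowledge, group, opinion in respondents:
        if knowledge == "informed":
            informed[group][opinion] += 1

    weights = Counter()
    for knowledge, group, opinion in respondents:
        pattern = informed.get(group)
        if knowledge in reassign and pattern:
            total = sum(pattern.values())
            for informed_opinion, count in pattern.items():
                weights[informed_opinion] += count / total
        else:
            weights[opinion] += 1  # opinion left as stated

    grand_total = sum(weights.values())
    return {opinion: 100.0 * w / grand_total for opinion, w in weights.items()}


# Hypothetical respondents, for illustration only.
sample = [
    ("informed", "laborer", "for"),
    ("informed", "laborer", "against"),
    ("don't know", "laborer", "neutral"),
    ("misinformed", "laborer", "for"),
]
assumption_1 = opinion_distribution(sample)                                # no one changes
assumption_2 = opinion_distribution(sample, ("don't know", "misinformed")) # all uninformed change
assumption_3 = opinion_distribution(sample, ("don't know",))               # only don't knows change
```

Under the second assumption the overall result equals the informed distribution only when the control groups are weighted as they occur in the whole sample, which is why the choice of controls matters.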


IMPLICATIONS 


The findings of this study would seem to leave little doubt that on numerous 
economic and social issues opinions of the informed segment of a population 
are likely to differ substantially from those of the uninformed. Furthermore, 
on any particular question there can be no assurance a priori that it is one of
those for which the distribution of opinion does not vary with knowledge of the





9 The big fly in this ointment is that many of these people had already formed opinions and consequently may
be quite reluctant to change them even if they would have done so in other circumstances. It is also not unlikely
that some of these people may have what is tantamount to a vested interest in a particular attitude (as smaller
retailers might on the fair trade laws) and would not change their position irrespective of degree of knowledge.







TABLE 2 


DISTRIBUTION OF OPINIONS ON FOUR ISSUES AFTER DIFFERENT 
ADJUSTMENTS FOR EFFECT OF IGNORANCE 
Issue Opinion No change in Distribution of 
opinions informed only 
Minimum wage Raise 63.7% 74.9% 
Lower 4.6 2.6 
No change 31.7 


Total 100.0% 


Guaranteed annual wage For 46.3% 
Against 39.2 
Neutral 14.5 


Total 100.0% 


Gov’t. bond yields Yes 72.3% 
No 27.7 


Total 100.0% 


Fair trade laws Strengthen 20.5% 
Abolish 31.8 
Leave as is 47.7 


Total 100.0% 100.0% 


subject. These findings also supply some basis for inferring that the opinions of 
the same population are likely to be influenced strongly by the extent and dis- 
tribution of ignorance on the subject. They further raise the rather striking 
possibility that the high frequency of neutral opinions encountered on many 
polls may reflect not so much the “true” state of opinion as a question-response
phenomenon brought about by a high frequency of ignorance on the
issue combined with a tendency on the part of the ignorant to lean defensively 
toward a neutral position. 

Given these findings, what can be done to alleviate the situation? The pro- 
cedure advocated in the past for avoiding this danger has been that of allowing 
the respondent full opportunity not to express an opinion when asked. Clearly, 
however, this is not enough, for many people evidently volunteer opinions 
when they are not informed about the subject or when they think they are in- 
formed but are not. Hence, a more drastic approach is needed, the nature of 
which is fairly evident. On issues where ignorance or misinformation is sus- 
pected to be fairly widespread, an opinion survey would have to ascertain the 
state of knowledge of the respondent as well as his opinion on the subject. 
With information on the state of knowledge, which to judge by the experience 
of this study is not at all difficult to obtain, an intelligent evaluation of the 
state of opinion is greatly facilitated. In view of the fact that ignorance on 
many issues is likely to be widespread, and that the opinions of the uninformed 
can differ substantially from those of the informed, the absence of such data 
can at times produce completely misleading interpretations of public opinion. 
Even with these data, the danger is minimized, not eliminated. 





A METHOD OF ESTIMATING THE INTERCENSAL 
POPULATION OF COUNTIES 


Albert H. Crosetti
Seattle City Planning Commission Staff
AND


Robert C. Schmitt
Honolulu Redevelopment Agency, City and County of Honolulu


This paper describes a method of estimating the intercensal popula- 
tion of counties by multiple regression analysis of symptomatic data. A 
test of the proposed technique indicates it to be more accurate than any 
of four alternate methods. 


This study describes and tests a proposed technique for estimating the
intercensal population of counties and similar geographic areas. The suggested
technique (the “ratio-correlation method”) makes use of a multiple
regression equation relating changes in population distribution during an inter- 
censal period to accompanying changes in the geographic distribution of live 
births, registered vehicles, and public school enrollment. Data for the year in 
question are then substituted in the equation to obtain estimated intercensal 
population. Both this method and four alternate techniques were tested by 
preparing 1940 population estimates from 1930 and 1950 U. S. Census data
for the thirty-nine counties of Washington State, then comparing the estimates
to 1940 U. S. Census totals.

Intercensal population estimation should not be confused with the somewhat 
similar problem of postcensal estimation. Intercensal estimates refer to dates
between two past censuses; postcensal estimates, to the time since the most 
recent enumeration. Considerable attention has been given in recent years to 
the description and testing of postcensal estimating techniques. The problems 
of intercensal estimation, in contrast, have been largely disregarded. 

Relatively few methods appear with any frequency in the literature of the 
past decade. Simplest of the more commonly used techniques are arithmetic 
and geometric interpolation. Known and used for many years, they have been 
described most recently by Jaffe [1] and Wolfenden [2]. Arithmetic progres- 
sion assumes a constant intercensal amount of increase; geometric progression, 
a constant rate of increase. A more sophisticated approach is provided by the 
ratio method, heretofore used chiefly for forecasts and projections [3]. In its 
simplest form, the ratio method entails making an arithmetic interpolation of 
the county-to-state ratio of population between successive census years. This 
ratio is then applied to an existing independent estimate of population at the 
state level. A fourth technique, usually called the “vital statistics method,” 
was described with reference to postcensal estimates by Bogue in 1950 [4], and 
later applied to intercensal studies [5]. Bogue computed provisional estimates 
of population by applying estimated birth and death rates to the numbers of 
births and deaths, respectively, then averaged the two provisional results. A
somewhat more time-consuming method, used primarily by the U. S. Bureau
of the Census, requires the separate estimation of births, deaths, and net migra- 
tion [6]. The “ratio-correlation method,” initially presented in 1954 [7], is 
receiving its first application to intercensal analysis in the present paper. 
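
The simpler alternatives can be sketched in Python as below. The function names and signatures are illustrative assumptions. The worked figures are the Benton County populations the paper quotes further on (10,952 in 1930, 12,053 in 1940, 51,370 in 1950); the paper does not say which county produced the largest errors, but the two results here come to 158.5 and 96.8 per cent, the same as the maximum errors it reports for arithmetic and geometric interpolation.

```python
def arithmetic_interpolation(p_start, p_end, fraction):
    """Constant amount of increase; fraction = elapsed share of the period."""
    return p_start + fraction * (p_end - p_start)


def geometric_interpolation(p_start, p_end, fraction):
    """Constant rate of increase over the period."""
    return p_start * (p_end / p_start) ** fraction


def ratio_method(county_start, state_start, county_end, state_end,
                 state_estimate, fraction):
    """Interpolate the county-to-state ratio, then apply it to a state estimate."""
    share_start = county_start / state_start
    share_end = county_end / state_end
    share = arithmetic_interpolation(share_start, share_end, fraction)
    return share * state_estimate


def percent_error(estimate, actual):
    """Percentage deviation from the census figure, sign disregarded."""
    return abs(estimate - actual) / actual * 100.0


# Benton County, 1940 estimated from the 1930 and 1950 censuses (midpoint).
arithmetic_1940 = arithmetic_interpolation(10952, 51370, 0.5)
geometric_1940 = geometric_interpolation(10952, 51370, 0.5)
arithmetic_error = percent_error(arithmetic_1940, 12053)
geometric_error = percent_error(geometric_1940, 12053)
```

A county whose growth is concentrated late in the period, as here, is the worst case for both interpolations, since each spreads the twenty-year increase evenly over time.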

The first step in testing the ratio-correlation method was the derivation of a 
regression equation from data for the intercensal period. For testing purposes, 
a twenty-year span, from 1930 to 1950, was selected, although in actual practice
the maximum intercensal period would be only ten years. 

The final choice of symptomatic data for the regression equation was based 
in large measure on zero-order correlation coefficients relating county popula- 
tion changes to the various series available. In each instance the dependent 
variable was the percentage of the state’s population residing in a given county 
in 1950 divided by the percentage living there in 1930, and the independent 
variable was a corresponding ratio for the symptomatic data. Five sympto- 
matic series were available on a county basis for all of the necessary years: live 
births, deaths, registered vehicles, registered voters, and public school enroll- 
ment. These series were found to have the following coefficients of correlation 
with respect to population: 


ee. WG coca cs cs cae ts tameadbus « Gen ee s os +0.96 
Public school enrollment...................... +0.95 
SUGUONNUE WOUNUNs oo c . - snoeewas ai cheers estan ok +0.92 
Sees WEEE... 0 cc Pearcy icercenereTkus +0.89 
ER ec erie cece cheek ccs th ck eee +0.82 


Three of the four series most highly correlated with population were then 
selected for multiple regression analysis, producing the equation, 


$$X_{1.234} = 0.108910 + 0.223991\,X_2 + 0.157144\,X_3 + 0.446542\,X_4$$


in which $X_{1.234}$ was the computed intercensal (1930-1950) change in a county’s
share of statewide population, $X_2$ was the 1950 county-to-state ratio of live
births divided by the corresponding 1930 ratio, and $X_3$ and $X_4$ were comparable
ratios for, respectively, registered vehicles and public school enrollment. The
coefficient of multiple correlation was 0.987. It should be added that the decision
to use three variables, instead of two or four or five, was an arbitrary
one: it may well be that use of fewer variables would have had little adverse 
effect on the accuracy of the results, or that an additional variable would have 
significantly improved accuracy, although such findings would have mattered 
relatively little from a broad methodological point of view. The choice of 
registered vehicles instead of voters, despite a lower zero-order coefficient of 
correlation, was indicated by certain deficiencies in the published statistics. 
The method was applied and tested by substituting data for an intermediate 
year, 1940, in the above equation. For each county, the 1940 percentage of all 
live births in the state divided by the 1930 percentage was substituted for $X_2$,
and corresponding insertions were made for $X_3$ and $X_4$. The resulting value of
$X_{1.234}$ represented the estimated rate of change in a given county’s share of
state population between 1930 and 1940, that is, the 1940 per cent of state 
population divided by the 1930 per cent. This factor, applied to the 1930 percentage
of the state total, gave the percentage estimated for 1940. The percentages
were then adjusted to sum to 100.00 for all thirty-nine counties, then
applied to the 1940 enumerated population of the state to obtain estimates of 
absolute county population. (In actual practice, the state figure would be an 
estimate only, such as provided by the U. S. Bureau of the Census [6]. For 
purposes of the present study, it was assumed that an exactly accurate state 
estimate for the intercensal year had been made.) 
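
The application steps just described can be sketched as follows. The coefficients are those of the regression equation given in the paper; everything else (function names, the three-county state, and all input figures) is hypothetical and serves only to show the order of operations.

```python
# Coefficients of the paper's regression equation: constant, live births,
# registered vehicles, public school enrollment.
CONSTANT, B_BIRTHS, B_VEHICLES, B_ENROLLMENT = 0.108910, 0.223991, 0.157144, 0.446542


def predicted_share_ratio(births_ratio, vehicles_ratio, enrollment_ratio):
    """Estimated ratio of a county's later share of state population to its base share."""
    return (CONSTANT
            + B_BIRTHS * births_ratio
            + B_VEHICLES * vehicles_ratio
            + B_ENROLLMENT * enrollment_ratio)


def estimate_counties(base_shares, symptomatic_ratios, state_total):
    """County population estimates for the intermediate year.

    base_shares        -- {county: per cent of state population in the base year}
    symptomatic_ratios -- {county: (births, vehicles, enrollment) share ratios}
    state_total        -- state population for the year being estimated
    """
    raw = {county: share * predicted_share_ratio(*symptomatic_ratios[county])
           for county, share in base_shares.items()}
    scale = 100.0 / sum(raw.values())  # force the percentages to sum to 100
    return {county: pct * scale / 100.0 * state_total for county, pct in raw.items()}


# Hypothetical three-county state, for illustration only.
base_shares = {"A": 50.0, "B": 30.0, "C": 20.0}
symptomatic_ratios = {
    "A": (1.05, 1.10, 1.02),
    "B": (0.95, 0.90, 0.97),
    "C": (1.00, 1.00, 1.00),
}
estimates = estimate_counties(base_shares, symptomatic_ratios, state_total=1_000_000)
```

Because the percentages are rescaled to sum to 100, only the relative sizes of the predicted ratios across counties affect the final estimates.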

The resulting county estimates were then compared to totals from the 1940 
U. 8. Census. Percentage deviation of each estimate from the census figure, 
signs disregarded, was taken as the measure of relative accuracy. 

The method under study resulted in a smaller percentage of error than any 
of the four alternate methods with which it was compared. Average “error” of 
the ratio-correlation method was 6.0 per cent, well under the average for either 
the vital statistics method (9.5 per cent), the ratio method (11.1 per cent), 
geometric interpolation (also 11.1 per cent), or arithmetic interpolation (13.9 
per cent). A more detailed comparison appears in the following table: 


                                Arith-    Geo-                Vital        Ratio-
Per cent error                  metic     metric    Ratio     statistics   correlation

All counties                      39        39        39         39           39
Less than 5.0 per cent            15        16        11         15           17
5.0 to 9.9 per cent                8         7        16          9           15
10.0 to 24.9 per cent             12        13         9         13            7
25.0 per cent or more              4         3         3          2           --
Maximum error (per cent)       158.5      96.8     106.1       30.2         16.6
Average error (per cent)        13.9      11.1      11.1        9.5          6.0


Accuracy of the ratio-correlation method varied with both absolute population
of the county and the growth rate of the county during the entire intercensal
(1930-1950) period. As might be expected, estimates for populous counties
were considerably more accurate than estimates for small counties. A more
surprising finding was the higher degree of accuracy obtained for rapidly grow- 
ing counties than for stable or declining areas. It is noteworthy that the error 
for Benton County was only 6.2 per cent, despite an extreme acceleration in 
growth for that area after 1940 (from 10,952 in 1930 and 12,053 in 1940 to 
51,370 in 1950). Relevant data for the thirty-nine counties are as follows: 


                                   Population in 1950       Per cent growth, 1930-1950
Per cent error                   Less than    25,000        Less than     30.0
                                 25,000       or more       30.0          or more
All counties..................      20          19             19           20
Less than 5.0 per cent........       5          12              8            9
5.0 to 9.9 per cent...........       9           6              6            9
10.0 per cent or more.........       6           1              5            2
Maximum error (per cent)......    15.0        16.6           16.6         15.0
Average error (per cent)......     7.6         4.3            6.7          5.2
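The Benton County counts quoted above show concretely why simple interpolation breaks down when growth accelerates late in the period. The sketch below is illustrative only. It assumes that arithmetic and geometric interpolation are applied directly to the 1930 and 1950 census counts and evaluated at the 1940 mid-point, a detail the text does not spell out; the function names are hypothetical.

```python
import math


def arithmetic_midpoint(pop_start, pop_end):
    """Mid-period value under linear (arithmetic) interpolation."""
    return (pop_start + pop_end) / 2.0


def geometric_midpoint(pop_start, pop_end):
    """Mid-period value under constant-rate (geometric) interpolation."""
    return math.sqrt(pop_start * pop_end)


def absolute_percent_error(estimate, census):
    """Percentage deviation from the census count, sign disregarded."""
    return abs(estimate - census) / census * 100.0


# Benton County census counts as quoted in the text.
pop_1930, pop_1940, pop_1950 = 10_952, 12_053, 51_370

arith_error = absolute_percent_error(arithmetic_midpoint(pop_1930, pop_1950), pop_1940)
geom_error = absolute_percent_error(geometric_midpoint(pop_1930, pop_1950), pop_1940)

print(f"arithmetic interpolation error: {arith_error:.1f} per cent")
print(f"geometric interpolation error:  {geom_error:.1f} per cent")
```

Under these assumptions the errors come to about 158.5 and 96.8 per cent. These equal the maximum errors tabulated for the two interpolation methods, which suggests, though the text does not say so, that Benton County was the extreme case for both.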





590 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1956 


Although the implications of the above tests seem reasonably clear, it should
be kept in mind that the tests are by no means exhaustive. Among their more 
important limitations are the following: 

1. The tests were confined to the middle year of a given twenty-year span. 
The effect of a shorter intercensal period (as would indeed be the case in actual 
practice) could not be determined from available data. No effort was made to 
compare equations for different periods. Accuracy of the methods under dis- 
cussion for years other than the mid-point of the period was impossible to 
ascertain.

2. The tests were confined to data for a single state. It is evident, however, 
that the ratio-correlation method is applicable to data for other states as well. 
Similar equations should be developed and tested for these other areas. 

3. The effect of introducing modifications into these techniques, either the 
ratio-correlation or various alternate methods, was not studied. As noted 
earlier, the use of fewer, more, or simply different variables could conceivably 
produce results of equal or superior accuracy. Another promising modification, 
possible for the vital statistics method as well as the ratio-correlation method, 
would be the use of three-year averages to help eliminate random fluctuations 
in symptomatic series. 
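The three-year averaging suggested in the third limitation can be sketched as below. This is a hedged illustration: it assumes a centered average (each year averaged with its two neighbors), which is one plausible reading of the suggestion, and both the series values and the function name are hypothetical.

```python
def three_year_average(series):
    """Centered three-year mean of an annual series.

    `series` maps year -> value. A year is smoothed only when both adjacent
    years are present, so the first and last years are left out.
    """
    years = sorted(series)
    smoothed = {}
    for prev, cur, nxt in zip(years, years[1:], years[2:]):
        if nxt - prev == 2:  # three consecutive calendar years
            smoothed[cur] = (series[prev] + series[cur] + series[nxt]) / 3.0
    return smoothed


# Hypothetical annual counts for one symptomatic series in one county.
births = {1938: 410, 1939: 365, 1940: 440, 1941: 395, 1942: 450}
smoothed = three_year_average(births)

for year in sorted(smoothed):
    print(year, round(smoothed[year], 1))
```

The smoothed 1940 value then replaces the single-year figure when the county's share of the state total is computed, which damps year-to-year noise at the cost of losing the end years.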

The foregoing tests, whatever their limitations, nevertheless indicate the 
ratio-correlation method to have considerable value for estimating the inter- 
censal population of counties. Its average error, only 6.0 per cent in a twenty- 
year intercensal period, was found to be considerably lower than averages for 
each of four well-known alternate techniques. The proposed method would 
appear to be eminently suitable for use at the state level as well, where a greater 
selection of symptomatic data is available. The need for accurate intercensal 
population estimates should in all events prompt continuing research in this 
field, with respect to both the ratio-correlation method and other techniques. 


REFERENCES 


[1] U. S. Bureau of the Census, Handbook of Statistical Methods for Demographers
(preliminary edition, second printing), by A. J. Jaffe, Washington, D. C.: U. S.
Government Printing Office, 1951, Chapter VII.

[2] Wolfenden, Hugh H., Population Statistics and Their Compilation (revised edition). 
Chicago: University of Chicago Press, 1954, 80-4. 

[3] Schmitt, Robert C., “Forecasting city population by the ratio method,” Journal of the 
American Water Works Association, 46 (1954), 960-2. 

[4] Bogue, Donald J., “A technique for making extensive population estimates,” Journal 
of the American Statistical Association, 45 (1950), 149-63. 

[5] Kitagawa, Evelyn M., and Bogue, Donald J., Suburbanization of Manufacturing
Activity within Standard Metropolitan Areas (Studies in Population Distribution No. 9).
Oxford, Ohio, and Chicago: Scripps Foundation for Research in Population Problems,
Miami University, and Population Research and Training Center, University of
Chicago, 1955, 6-7.

[6] U. S. Bureau of the Census, “Estimates of the Population of States: July 1, 1940 to
1949,” Current Population Reports, Population Estimates, Series P-25, No. 72, May 
1953. 

[7] Schmitt, Robert C., and Crosetti, Albert H., “Accuracy of the ratio-correlation meth- 
od for estimating postcensal population,” Land Economics, 30 (1954), 279-81. 





RESEARCH ON METROPOLITAN POPULATION: 
EVALUATION OF DATA 


OTIS DUDLEY DUNCAN
University of Chicago 

The proximity of the 1960 Census makes timely an evaluation of ex- 
isting census data from the standpoint of research uses; suggestions in- 
cluded in this paper are meant to furnish a basis for discussing plans for 
the forthcoming censuses. Despite many recent improvements, census 
data present numerous problems of quality, comparability, detail, and 
area classification of data, particularly in studies involving area sub- 
divisions of standard metropolitan areas or temporal comparisons. 
Among the needed improvements are (1) better coordination of the area 
units of statistical reporting, (2) reconstruction of historical series of 
metropolitan population and economic data, and (3) fuller recognition 
of metropolitan population as a distinctive residential category. 


IT CAN be assumed that a large proportion of the research on metropolitan
population in the foreseeable future will be conducted within the broad
framework already laid down by previous studies. Hence investigators will be 
making both cross-sectional comparisons and trend analyses. Cross-sectional 
studies involve comparisons among individual metropolitan units or classes of 
units, comparisons between metropolitan and nonmetropolitan units, and
comparisons among component parts of metropolitan units. Trend analyses involve
the same comparisons, with the added dimension of time. 

Leaving aside the possibilities of case studies of individual metropolitan 
units, it is likely that the bulk of future research on metropolitan population 
will continue to depend heavily or exclusively on data compiled by Federal 
statistical agencies. It should be observed at the outset, then, that such data 
are now available in greater quantity and in a more appropriate form for re- 
search on metropolitan population than at any time in the past. The partial 
coordination of Federal statistics in terms of the Standard Metropolitan Area 
concept is an admirable achievement, one which opens up many lines of
research not possible in the past. In fact, it seems unlikely that the number of
research workers now in the field of metropolitan population is more than a 
small fraction of the number required adequately to exploit the available data. 

Despite this generally favorable situation, the research worker is confronted 
with numerous problems of the adequacy of data, including problems of the 
suitability of the area units by which the data are tabulated, problems of the 
detail in which data are available, problems of the quality of data, and prob- 
lems of the comparability of data among units and, especially, over time. This 
paper attempts only to indicate some of the more important problems of this
kind, with respect to the data on population and labor force characteristics, 
migration, vital statistics, housing, business, and manufactures. 

Some of the virtues of the S.M.A. (Standard Metropolitan Area) as a unit
for research purposes entail troublesome defects. Because it is a standard unit
for many statistical series, the problem of correlating these series is greatly
simplified. But in introducing the standard unit it was necessary to sacrifice
comparability with earlier data based on different units. Thus, one cannot
compare the Metropolitan Districts of 1940 and earlier censuses with the S.M.A.’s
of 1950. To some extent this can be remedied by reconstructing S.M.A. data
for earlier periods by combining counties,¹ but this involves much labor, and
can be carried out only for a rather limited range of data.

Again, while the construction of S.M.A.’s out of county building blocks
simplifies statistical coordination, it impairs a number of kinds of comparisons
among metropolitan units. The county is by no means a uniform unit of land
area. In fact, there is an inverse relationship, on a state-by-state basis, between
average size of county and population density. Hence, S.M.A.’s tend to be made
up of rather gross area units in regions of low population density and rather fine
units in regions of high density. This tendency is enhanced by the rule of using,
not counties, but minor civil divisions as the components of S.M.A.’s in New
England (incidentally, in violation of the principle of maximizing statistical
coordination). As a result, population density or any other area-based index
cannot usefully be compared among S.M.A.’s. Moreover, there is undoubtedly
great variation among S.M.A.’s in the extent to which their boundaries
encompass the area of peripheral population growth. This renders equivocal the
comparison of total growth rates, and particularly the comparison of peripheral
growth rates among S.M.A.’s.

Mention of peripheral growth rates reminds one of another difficulty. It is
usual to divide the S.M.A. into two parts, the central city or cities and the
metropolitan ring. It is questionable whether this division warrants close
comparisons among S.M.A.’s with respect to such matters as the relative growth
rates of central city and ring, differential population characteristics of central
city and ring, or movement of population and economic activities between
central city and ring. In some S.M.A.’s the nominal central city is surrounded
by large, densely populated, incorporated suburbs, while in others the corporate
boundaries of the central city are virtually coextensive with the limits of
the core of urban settlement in the S.M.A. For many studies (with the notable
exception of those involving considerations of local government administration)
such variation can only be regarded as accidental and a serious disturbance
to meaningful comparisons. The problem can be overcome, to an extent, by
taking as the metropolitan nucleus the Urbanized Area (also a recent innovation
in tabulation units), but this raises additional problems. The delineation of
Urbanized Areas was not closely coordinated with the definition of S.M.A.’s.
Consequently, there are some S.M.A.’s which contain no Urbanized Areas.
In many cases the boundaries of Urbanized Areas extend beyond those of the
corresponding S.M.A.’s, and in some cases an S.M.A. may contain parts of
two Urbanized Areas. Finally, only certain population and housing statistics
are available for Urbanized Areas, and it is difficult or impossible to compile
other data for these units.²





1 This has been done, for total population only, by Donald J. Bogue in Population Growth in Standard
Metropolitan Areas 1900-1950, with an Explanatory Analysis of Urbanized Areas (Washington: Government
Printing Office, 1953).

2 See Bogue, op. cit., Chapters 1 and 3. 





METROPOLITAN POPULATION DATA 593 


If it is true that the central city-ring division is too gross and unreliable to
afford a satisfactory basis for studies of the internal areal structure of the
metropolitan unit, it is also true that there is a danger of overlooking certain
resources for such studies. In a recently completed study of population
characteristics of the Chicago S.M.A. it was possible to compile statistics separately
for the central city; large, medium-sized, and small suburbs; urban, rural-nonfarm,
and rural-farm parts of the rural-urban fringe; and urban and rural parts
of the outlying satellite area.³ This involved tabulations from the census tract
summary cards for the area defined as the 1940 Chicago Metropolitan District.
These cards permit one to group urban tracts by size of place, and to segregate
the urban and rural parts of tracts which are of mixed residential classification.
The differences among components of the S.M.A. discovered by this tabulation
suggest the general advisability of making finer areal subdivisions of
metropolitan areas than are published. A quick check of the 25 Urbanized Areas of
a half million population or more indicates that there are probably 10 with a
sufficiently large adjacent tracted area to make this kind of study quite
feasible,⁴ and an additional 6 for which it might or might not be worthwhile.

In addition to the need for detailed area subdivisions of S.M.A.’s, some types
of investigation require a broader concept of the metropolitan unit than the
S.M.A. Thus, Bogue has constructed a system of “metropolitan communities”
covering the entire United States, and Hawley has worked with a concept of
the “extended metropolitan area.” Experiments along these lines encounter the
obstacle that metropolitan areas can be delineated only by using as building
blocks area units for which data are published. It may be that further research
of this kind will demonstrate a need for a new type of metropolitan unit for
statistical tabulations.

Perhaps the most favored sector of metropolitan research, from the stand- 
point of richness of data, is that concerned with cross-sectional comparison of 
population characteristics. Extensive tabulations are available for the larger 
S.M.A.’s, and to a considerable degree these can be matched with data for 
central cities or urbanized areas, or as indicated above, even finer subdivisions 
of the S.M.A. in some cases. It might be pointed out that in providing S.M.A.
tabulations the Census Bureau has, in effect, recognized a residential category 
of “metropolitan population” which cuts across its standard urban-rural resi- 
dential classification. The metropolitan counties of the United States contain, 
besides over three-fourths of the total urban population, nearly three-tenths of 
the rural-nonfarm, and one-ninth of the rural-farm population (see Table 1), 
in a land area which is only one-twentieth of the country’s total. Perhaps it 
would be well to carry out the implications of the S.M.A. concept to their
logical conclusion, and to use the metropolitan-nonmetropolitan dichotomy as 
a regular residential classification in conjunction with and as a supplement to 
the conventional one. Ample justification for this step exists in the finding that 
the metropolitan population has strikingly different characteristics from the 
nonmetropolitan, even within the urban, rural-nonfarm, and rural-farm cate- 





3 Otis Dudley Duncan and Albert J. Reiss, Jr., Social Characteristics of Urban and Rural Communities: 1950 
(New York: John Wiley and Sons, 1956), Chapter 12. 

4 Chicago, Dallas, Kansas City, Minneapolis-St. Paul, New Orleans, New York, Portland (Oregon), Provi- 
dence, San Francisco-Oakland, and Washington. 







TABLE 1 


PERCENTAGE DISTRIBUTION OF POPULATION BY URBAN AND RURAL 
RESIDENCE AND TYPE OF COUNTY, FOR THE UNITED STATES: 1950 


Residence              All          Metropolitan     Nonmetropolitan
                       counties     counties*        counties

Total                  100.0           56.8              43.2
Urban                  100.0           76.6              23.4
Rural nonfarm          100.0           28.9              71.1
Rural farm             100.0           11.5              88.5

Total                  100.0          100.0             100.0
Urban                   64.0           86.4              34.7
Rural nonfarm           20.7           10.5              34.0
Rural farm              15.3            3.1              31.3


* As listed in Table 2.01, National Office of Vital Statistics, Vital Statistics of the United States, 1950, Vol. I
(Washington: Government Printing Office, 1954). 


gories.⁵ There are also precedents for this step in the adoption by the National
Office of Vital Statistics of the metropolitan-nonmetropolitan classification of
counties as a standard tabulation procedure, and in the summary tabulation
of nonfarm housing characteristics for all S.M.A.’s combined.
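The two panels of Table 1 express the same cross-classification in two ways: the upper panel distributes each residence category between metropolitan and nonmetropolitan counties, and the lower panel distributes each type of county among residence categories. As an illustrative check (an editorial sketch, not part of the original analysis, with hypothetical variable names), the metropolitan column of the lower panel can be recovered from the upper panel and the "All counties" composition:

```python
# Composition of the total U.S. population by residence (lower panel, "All counties").
overall = {"Urban": 64.0, "Rural nonfarm": 20.7, "Rural farm": 15.3}

# Per cent of each residence category living in metropolitan counties (upper panel).
metro_share = {"Urban": 76.6, "Rural nonfarm": 28.9, "Rural farm": 11.5}

# Metropolitan residents of each category, as a per cent of the total U.S. population.
metro_parts = {k: overall[k] * metro_share[k] / 100.0 for k in overall}

# Per cent of the total population living in metropolitan counties.
metro_total = sum(metro_parts.values())

# Composition of the metropolitan population by residence category.
metro_composition = {k: 100.0 * v / metro_total for k, v in metro_parts.items()}

print(f"metropolitan share of total: {metro_total:.1f}")
for category, pct in metro_composition.items():
    print(f"{category}: {pct:.1f}")
```

The results agree with the published metropolitan column (56.8 in total; 86.4, 10.5, and 3.1 by residence). The same calculation for nonmetropolitan counties matches the table only to within about 0.1, since the inputs are themselves rounded.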

There is still another problem with residence classifications in research on 
metropolitan population characteristics. It is becoming increasingly important 
to recognize the distinction between the place of residence and the places at 
which work, consumption, and leisure activities are engaged in by inhabitants 
of metropolitan areas. Tabulations by place of residence no longer serve to 
characterize adequately the spatial orientations of a fluid population. In the 
Chicago study already referred to, little difference could be found between the
characteristics of the nominally urban and the nominally rural-nonfarm resi- 
dents of the fringe. Probably these two categories of people, on the whole, lead 
rather similar lives. A more meaningful approach, perhaps, would distinguish 
workers in urban pursuits from those with rural occupations, or commuters 
from those employed locally. 

Among the least satisfactory data for metropolitan studies are those per- 
taining to migration. This is especially unfortunate in view of the overwhelming 
importance of migration, as compared with natural increase, as a factor in 
metropolitan population redistribution—both within metropolitan areas and 
between metropolitan and nonmetropolitan areas. It is, of course, possible to 
make useful residual estimates of net migration over a 10-year period, and such 
estimates are probably as accurate for metropolitan units as for other kinds of 
areas. Such estimates, however, reveal little of the socio-economic selectivity 
of inter- and intra-metropolitan migration, and do not permit more than 
hazardous inferences as to the direction and magnitude of the streams of 
migration whose net balance is observed. The 1935-40 tabulations of migration 





5 Duncan and Reiss, op. cit.; Otis Dudley Duncan, “Gradients of urban influence on the rural population,”
Midwest Sociologist, 18 (Winter 1956), 27-30. 







between subregions and the 1949-50 tabulations by State Economic Areas
yield data which are not entirely comparable with population tabulations for
S.M.A.’s, because the latter frequently lie in more than one State. Probably
no one knows how accurately these statistics depict the volume of migration
into and out of metropolitan areas. But the writer believes that there is
considerable underreporting of migration, e.g., in the case of Negro migrants to
northern metropolitan areas. While the 1935-40 statistics are so tabulated as
to yield information on movement between central cities and outlying parts
of metropolitan areas, there is reason to believe that misreporting of 1935
residence produced some serious errors in the data. The tabulated 1949-50
migration statistics do not attempt to show streams of intra-metropolitan
movement. The limitation of the migration data in the 1950 Census to movement
over a one-year period and the collection of this information for only a 20 per
cent sample mean that frequencies are often too small to permit reliable
comparisons of detailed characteristics of migrants. Confining the inquiry to a
sample probably also increased the rate of non-response.
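The residual estimate of net migration mentioned above rests on the usual population balance: the change in population over the period, less natural increase (births minus deaths). A minimal sketch follows; the figures and the function name are hypothetical. As the text observes, such a residual shows only the net balance and nothing about the size or direction of the underlying streams.

```python
def net_migration_residual(pop_start, pop_end, births, deaths):
    """Net migration as the residual of the population balance.

    pop_end - pop_start = (births - deaths) + net migration
    """
    natural_increase = births - deaths
    return (pop_end - pop_start) - natural_increase


# Hypothetical metropolitan area over a 10-year intercensal period.
net = net_migration_residual(
    pop_start=100_000,  # census count at the start of the decade
    pop_end=130_000,    # census count at the end of the decade
    births=20_000,      # registered resident births during the decade
    deaths=9_000,       # registered resident deaths during the decade
)
print(f"estimated net migration: {net:+,}")
```

Any error in the census counts or in the registration of births and deaths falls entirely on the residual, so the estimate is only as good as the enumeration and the vital statistics behind it.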

Birth and death statistics for metropolitan units are available in considerable 
detail for 1950. Recent improvements in the completeness of vital registration 
and the tabulation of statistics by place of residence make these data a valuable 
resource. However, particularly for intra-metropolitan studies, one must be 
aware of the substantial errors in reporting of place of residence and of the lack 
of comparability between census and vital statistics as to the classification of 
urban and rural population. The writer’s impression is that the system for com- 
piling marriage and divorce statistics has not yet developed to the point where 
it contributes much to the study of metropolitan population. 

Most of the remarks already made about statistics on population characteristics
apply as well to the available housing statistics. Extensive and revealing
cross-tabulations of housing characteristics for nonfarm dwelling units in
S.M.A.’s constitute a challenge to the analytical resourcefulness of research
workers. Unfortunately these statistics are less useful than they might be for
the simple reason that the different tables do not maintain the same class
intervals or classification categories from one table to another. This is a rather
surprising lapse from the Census Bureau’s generally high standards of
presentation.

One of the major problems in relating metropolitan population distribution
and characteristics to economic activities is that the dates of the Censuses of
Business and Manufactures and the Censuses of Population and Housing are
not the same. When population growth and economic expansion are rapid this
poses considerable problems of comparability. Some minor problems are
occasioned by the disclosure rule, which requires suppression of any information
which would permit one to infer the characteristics of an individual business or
manufacturing establishment. More serious is the fact that the industry
breakdown of manufacturing in the 1947 Census is available only for 53 S.M.A.’s
reporting 40,000 or more manufacturing employees. This means that industry
statistics, on an establishment basis, are not available for a considerable number
of the smaller S.M.A.’s which are actually manufacturing centers, with a half
or more of their labor force employed in manufacturing. Moreover, the industry
classification is not available for the division of S.M.A.’s into central cities and
rings. Finally, for both business and industry statistics the areal basis of
reporting is the place of work; this means that one encounters serious difficulties in
comparing economic with population data in studies of relatively small areas.

The foregoing remarks have not stressed problems of time comparability 
but these are perhaps the most difficult ones faced by the student of metro- 
politan population, though varying in difficulty from one type of statistics to 
another. 

Certain general conclusions seem warranted: (1) In general, the adequacy of
data varies inversely with the complexity of the research design and with the 
analytical precision attempted. This becomes a more pressing problem to the 
extent that research workers no longer feel satisfied with just describing gross 
trends and relationships and attempt to push their studies in the direction of 
greater refinement. (2) In general, the available data are better suited for cross- 
sectional comparative studies as of a recent point in time than for studies of 
trends. There are relatively few statistics suitable for metropolitan studies 
which are available, with reasonable comparability, in time series of more than 
two or three decades. (3) In general, the available data are more suitable for 
studies in which the metropolitan area is handled as an indivisible unit than for 
those attempting areal breakdowns within metropolitan units. 

One recognizes, of course, that not all statistics are compiled solely for the 
convenience of students of metropolitan population. But these students would 
be remiss if they failed to make known their high priority needs. The following
list of needs is but a partial statement of one individual’s views, and should be 
discounted and supplemented accordingly: (1) There is a need for further co- 
ordination of areal bases of statistical reporting, to obviate such anomalies as 
the minor civil division basis of S.M.A. delineation in New England; the
disjunctiveness of S.M.A., State Economic Area, and Urbanized Area units; and
the parallelism and overlapping of the rural-urban and metropolitan-non- 
metropolitan residential classifications. (2) There is a need for a comprehensive 
reconstruction of statistical time series for metropolitan areas through the con- 
solidation of available past data on the building blocks of which they are con- 
structed. This is a task beyond the competence of most private investigators. 
(3) There is a need to maintain comparability of future statistical compilations 
with those presently available in regard to the tabulation units for metropolitan 
statistics. Hence innovations and improvements in metropolitan statistics must 
not be bought at too high a price in terms of loss of comparability, and splicing 
statistics should be provided where possible. 





MEASURING SPATIAL ASSOCIATION WITH SPECIAL CON- 
SIDERATION OF THE CASE OF MARKET ORIENTATION 
OF PRODUCTION 


WILLIAM WARNTZ
American Geographical Society, New York*


GEOGRAPHERS are frequently interested in aspects of “populations” which are
ignored or obscured by the use of standard statistical methodology. One
of the basic characteristics of a “population” is its spatial distribution. This kind 
of distribution is, however, one that does not admit of easy and precise measure- 
ment and description. 

There also exists the problem of developing an adequate technique for meas- 
uring the spatial association of geographically distributed phenomena. It is the 
purpose of this paper to attempt to present such a measure and to indicate one 
of its applications. 

In considering the problem of the development of a measure of spatial 
association of geographically distributed phenomena, two aspects must be dis- 
tinguished. 


A. SITE CHARACTERISTICS 


With regard to what the geographer has come to call “site characteristics,” 
correlations present no special problems. Site characteristics refer to the nature 
of a place. They are conditions which are not necessarily due to spatial
arrangements. A given point can be considered as concomitantly experiencing
a variety of site factors. Hence, a point (or small area) may have fertile soil, 
income tax laws, union regulations, a long growing season, rock bearing 50% 
iron by weight, etc. The site factors are those conditions with which the place 
under consideration is “equipped”—the nature of the place. In principle, site 
characteristics can be found from measurements just at one place. 

Correlations using data such as these face no special problems. For example, 
if the effect of all other factors can be removed, a relationship between warmest 
month average temperature and, say, yields per acre of a given crop can be 
discovered by taking the paired observations from a large number of points or 
small areas. But, this is not spatial correlation comparing the geographies of 
two populations. 


B. POSITION CHARACTERISTICS 


The geographer recognizes not only “site,” but also the “position characteristics”
of a place. Defined simply, position may be thought of as the locational
relationship of a place to the rest of the world. More specifically, one can
consider the location of a place as regards the production of coal or perhaps wheat
or any other thing of concern. How then can a measure be developed which
indicates the locational relationship of a given place to the aggregate of, say, the
annual wheat production in the United States as the bushels (i.e., the units in
the wheat population) are actually geographically distributed?

* This paper was prepared while the author was a member of the faculty of the Wharton School of the
University of Pennsylvania. The opportunity is taken to thank John Q. Stewart, Princeton University, and
Almarin Phillips, University of Virginia, for their many kindnesses and considerations.

In addition, if an attempt is to be made to measure the degree of spatial 
association in the distributions of two distinct populations, each one made up of 
a number of units, certain obstacles are encountered. These difficulties arise if 
the point on the earth’s surface cannot simultaneously possess units of both 
populations, or for that matter, more than one unit of either population. 

John K. Wright has suggested a solution to this problem¹ which involves
systems of areal subdivisions recognizing that if points be expanded to areas 
then the areas can have varying amounts of either or both populations. Wright 
suggests that maximum areal association be considered as occurring when the 
ratio of units in Population A to units in Population B is the same for each 
areal subdivision. If there is not at least one subdivision containing a unit from 
each population then the degree of areal association is considered to be zero. 
Wright clearly points out that any value found (and the steps necessary to 
find values between zero and one are not here considered; readers should see
Wright’s article) must be considered as a relative coefficient dependent upon the 
system of subdivision. If the areal subdivisions be made small enough the co- 
efficient becomes zero since in the limit the units of the populations occupy 
wholly different points. (Note: another objection lies in the fact that the method 
does not admit of negative correlation.) 

In discussing his coefficients of areal association of two populations and also 
coefficients of evenness of distribution of a single population, Wright despairs 
of finding “absolutes” not based on arbitrary areal subdivision. He does, how- 
ever, caution that “distances and directions of displacements” would apparently 
have to be considered. 

It is toward this problem that the major part of this paper is directed. 


C. POSITION AND SPACE POTENTIALS 


For the purposes of social science it may be useful to assume that a popula- 
tion concentration exerts an “influence” or force which varies directly with the 
size of the population and inversely with distance from it. Let those unwilling 
to accept the notion of “influence” think in terms of accessibility. Populations 
are most accessible from the location standpoint at the place they occupy. 

The level of “influence” or accessibility around a given population concen- 
tration could be mapped using a simple contour technique and expressed in 
terms of persons per mile (or some other convenient persons/distance units). 

However, many population concentrations may exist in an uneven distribu- 
tion over a given area so the population force (or accessibility) at a given 
point is contributed to by many concentrations as they are geographically dis- 
tributed, indeed by all persons as they are geographically distributed. 

This idea can be expressed as follows: 


\[ V_C = \int \frac{1}{r}\, D \, dS \]





1 John K. Wright, “Some measures of distributions,” Annals of the Association of American Geographers, 
XXVII (1937) 177-211, 







letting
V_C = influence at any point C in the plane,
D = surface density of the mass (the familiar density of population) over the
    infinitesimal element of the area dS,
r = the distance from the element under consideration to point C,
and the integration be extended to all areas of the plane where D is not zero.

Thus, if it be known how population is distributed over the area, the
accessibility in the aggregate can be computed for every point. The results can be
portrayed on a map of the surface by use of contours analogous to contours on 
a topographic map with the “hills” here contoured being those of aggregate 
population influence or accessibility. 
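In practice the integral has to be approximated from tabulated data. The sketch below is one simple way to do so, offered as an illustration rather than as the procedure used in the paper: each population concentration is treated as a point, and its contribution P/r is summed at the point of interest. The coordinates, populations, function name, and the `min_distance` floor (used to avoid division by zero when the point coincides with a concentration) are all assumptions introduced here.

```python
import math


def population_potential(target, concentrations, min_distance=1.0):
    """Discrete approximation of V_C: the sum over concentrations of P_j / r_j.

    target         -- (x, y) coordinates of point C, e.g. in miles
    concentrations -- iterable of (x, y, population) tuples
    min_distance   -- floor applied to r_j (an assumption made here, so that a
                      concentration located at C itself does not divide by zero)
    """
    x0, y0 = target
    total = 0.0
    for x, y, population in concentrations:
        r = max(math.hypot(x - x0, y - y0), min_distance)
        total += population / r
    return total


# Hypothetical concentrations: (x, y, persons).
concentrations = [(0.0, 0.0, 1000), (3.0, 4.0, 500)]

# Potential at the origin: 1000 / 1 (floored distance) + 500 / 5.
print(population_potential((0.0, 0.0), concentrations))  # 1100.0
```

Evaluating the function over a grid of target points yields the values from which contours of equal potential, in persons per mile, could be drawn.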

The reader familiar with physics will notice the isomorphic relation of the 
above to the Newtonian notion of gravitation and the Lagrange concept of
potential. For this reason, John Q. Stewart identified the V in the above
equation as “Population Potential.”²

Newton’s original statement of the law of gravitation involved the notions 
of a particle of mass M at point A at a distance d from a second particle of mass m 
at point a. The force F that acts on each mass attracting them along the line 
joining them varies inversely as the square of the distance, thus: 


\[ F = \frac{G M m}{d^{2}}, \]

where G is the universal gravitational constant.
The mutual energy E of the two masses in this gravitational field is as follows:

\[ E = \frac{G M m}{d}, \]

and the gravitational potential V_A which the mass m produces at point A is:

\[ V_A = \frac{G m}{d}. \]

The potential V_a which mass M produces at a is:

\[ V_a = \frac{G M}{d}. \]


But, as previously noted, there may be many masses distributed through 
space and the above equations apply to each pair in turn. Consequently the 
integration is necessary if the potential is to be calculated for any point. 
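
Written out for a finite set of masses, the step from the pairwise formula to the integral can be shown as follows. The index j and the symbols $m_j$ and $d_j$ (the j-th mass and its distance from point A) are notation introduced here for illustration only.

$$V_A = \sum_j \frac{G\, m_j}{d_j} \quad\longrightarrow\quad V_A = G \int \frac{1}{d}\, dm$$

Dropping the constant G and writing the element of mass as D dS, with r for the distance, gives the population form $V_c = \int (1/r)\, D\, dS$.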

Stewart and others have presented a great variety of evidence to show that 
the above equations are applicable to the interrelations of people on the average 
in the study of social phenomena. All that has been necessary is to substitute 
number of people for mass. 

To the social physicist, population potential is yet another aspect of the universal
law of gravity. The geographer finds in it an extremely useful way of
quantifying the all-important “position” factor. To the statistically-minded 
the potential is a special sort of spatial moving average. 

Population potentials as calculated above involve each person at unitary 
weight. However, if the concept of potential is to be used for economic analysis 
it should be recognized that each person is not an equally effective economic 
unit. For certain applications, it seems logical to weight people by their incomes 





2 This idea was developed by Stewart as early as 1939. For a discussion, see his “Demographic Gravitation: 
Evidence and Applications,” Sociometry, XI (1948) 31-58. 





600 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1956 


as a kind of “molecular weight.” A potential of population so calculated as to 
consider not only numbers of people and their geographic distribution but also 
their incomes might well be called a Gross Economic Population Potential or 
Income Potential. Figure 1 is such a map³ based on the weighted average annual
values in the United States for the ten-year period 1940 to 1949.
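
The income weighting can be sketched as below, under the assumption that weighting each person by income amounts to replacing the population at a control point by the total income received there. The function name, its arguments, and the `self_distance` used for a point's own contribution are hypothetical and are not taken from the paper.

```python
import math

def income_potential(points, incomes, self_distance):
    """Income-weighted potential at every control point.

    points        -- list of (x, y) control-point coordinates
    incomes       -- total income at each control point
    self_distance -- assumed distance for a point's own contribution
    Returns (potentials, number_of_paired_measurements).
    """
    potentials = []
    pairs = 0
    for cx, cy in points:
        total = 0.0
        for (x, y), income in zip(points, incomes):
            # A zero distance (the point itself) falls back to self_distance.
            r = math.hypot(x - cx, y - cy) or self_distance
            total += income / r
            pairs += 1
        potentials.append(total)
    return potentials, pairs
```

With 48 control points the double loop makes 48 × 48 = 2,304 paired measurements. This agrees with the count given in the note to Figure 1 if each point is also paired with itself.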

If total income payments to individuals can be identified with total effective 
demand in a society, i.e., total possible claims that can be made on economic 
goods overlooking the temporary influences of past savings and credit, then a 
map of Gross Economic Population Potential can be considered as showing 
the geographic variation in intensity of effective demand. Spatial dimensions 
have been added to the aggregate of income. It is suggested that the National 
Income portrayed in this manner includes the very real aspect of geographical 
distribution. 

The values of GEPP can also serve as measures of market accessibility. The 
way in which proximity to total market varies geographically can be observed.⁴

Just as a map of potential of human population shows the geographical vari- 
ation of accessibility to people in the aggregate, so would a similar map based 
on a non-human population such as wheat (as its production is actually geo- 
graphically distributed) show the spatial variation of the proximity to that 
wheat in the aggregate. Such a map can be considered also as showing the geo- 
graphical variation in the intensity of supply of that commodity. Figures 2 
and 3 are Supply Space Potential maps for wheat and ice cream respectively. 

In the author’s Geography of Price, these data for many products plus meas- 
ures of time potentials based on time differences among phenomena were used 
to “explain” geographical variation in the prices of many commodities. Sub- 
stantial amounts of the geographical variations in the prices of certain commodi- 
ties were “explained.” The dimensionless law of price in economics—price 
varies directly with demand and inversely with supply—was expanded to 
include the time and space dimensions of the phenomena. The law, as expanded 
to include space and time dimensions, was stated as follows: Price varies di- 
rectly with Demand Space Potential and Demand Time Potential and inversely 
with Product Supply Space Potential and Product Supply Time Potential. 
It is not the purpose of this paper to redevelop those ideas. Interested readers 
are again referred to the work cited. 

It is submitted that utilization of the space potential technique makes pos- 
sible the measurement of the degree of areal association between certain kinds 
of geographical distributions in cases where such measurement might not other- 
wise easily be carried out. Space potential measurements make possible quanti- 
fication of the “position” factor in economic geography. 





3 This map was developed using a method of mechanical integration based on 48 state control points and 2,304 
separate paired measurements. Populations were assumed to be state-centered hence local variations and the peaks 
of large cities are obscured. For instance, the New York City peak, which is highest of all, is lost, shifted, and merged 
with the high plateau of GEPP in the Middle Atlantic States. Another far more detailed and fine-grained map based 
on several hundred control points is now being prepared by the author in connection with his current research. 
Nevertheless, Figure 1 quite adequately shows the general pattern for the United States. 

4 Discussion, interpretation, analysis, and various applications of this map can be found in W. Warntz, The
Geography of Price—An Attempt to Develop a Theory of the Geographical Variation of Price Using Concepts Measuring 
the Space and Time Dimensions of Certain Economic Phenomena, University of Pennsylvania (1955). (This book is
now in press.) 








[Fig. 1. United States: Annual Gross Economic Population Potential, 1940-1949 average, in billions of dollars per 100 miles.]


[Fig. 2. United States: Annual Wheat Supply Space Potential, 1940-1949 average, in tens of millions of bushels per 100 miles.]







[Fig. 3. United States: Annual Ice Cream Supply Space Potential, 1944-1948 average, in millions of gallons per 100 miles.]


D. A COEFFICIENT OF SPATIAL ASSOCIATION 


Assume, for example, that the degree of “market space orientation” of the 
production of a given commodity is to be somehow ascertained. An attempt is 
to be made to find the degree of spatial association between the geography of 
production of a commodity and the geography of its consumption, i.e., two
distinct populations, units of which normally do not simultaneously occupy
similar points. 

As previously suggested, one method might be to divide the country up into 
subdivisions. Then the percentage of total production and of total consump- 
tion (or some measure of effective demand) could be assigned to each sub- 
division. A coefficient of correlation could then be viewed as a measure of the 
spatial association, in this case the degree of market orientation of production. 

This method has several obvious faults, one of which has already been dis- 
cussed and is a technical difficulty. Another more subtle difficulty of a theoreti- 
cal nature attributable to the economic aspects of the data should also be recog- 
nized. 

In the first place, as stated, the size of the coefficient would, of course, vary 
with the number and size of the areal subdivisions used as base units. De- 
creasing the size of the unit lessens the likelihood of finding a high degree of 
spatial association of two geographic distributions although the actual distri- 
butions remain unchanged regardless of the system of areal subdivision. If the 
unit area be made small enough, such as could include only the cultivated area 







on which the crop was raised or the factory producing the gadget leaving no 
space for anything else, then the coefficient would be meaningless. 
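
The dependence of the coefficient on the size of the areal unit can be illustrated with a small one-dimensional sketch. The positions, the 8-unit extent, and both helper functions are invented for illustration. Production and consumption sites lie close together but never at the same spot.

```python
import math

def shares_by_unit(positions, width, extent):
    """Share of all items falling in each areal unit of the given width."""
    counts = [0] * int(extent / width)
    for x in positions:
        counts[int(x // width)] += 1
    return [c / len(positions) for c in counts]

def pearson(a, b):
    """Product-moment correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Hypothetical sites along a line of length 8.
production = [0.5, 0.7, 0.9, 4.5]
consumption = [1.5, 1.6, 1.7, 5.5]

coarse = pearson(shares_by_unit(production, 4.0, 8.0),
                 shares_by_unit(consumption, 4.0, 8.0))
fine = pearson(shares_by_unit(production, 1.0, 8.0),
               shares_by_unit(consumption, 1.0, 8.0))
```

With two units of width 4 the shares coincide and the coefficient is 1. With eight units of width 1 the same sites fall in different units and the coefficient drops to -0.25, although the distributions themselves have not changed.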

The second objection to a measure so developed stems from the nature of 
the data selected for this example and is associated with a theoretical considera- 
tion in economics. In areal subdivision methods allowance is not made for the 
fact that limited divisibility of certain factors of production and the con- 
sequent economies of large-scale production in certain industries may result in 
certain commodities being produced on a very large scale at a few, sometimes 
widely separated, sites insofar as transportation costs permit. 

The production of large amounts of a product at a few widely separated 
places may represent as high a degree of market-production spatial association 
as does a myriad of very small producers located in such a way as to seem to 
have by inspection a distribution quite like that of the consuming population. 

The space potential measures previously described help overcome both of 
the above-mentioned difficulties. Both Gross Economic Population Potential 
(if it be considered as a measure of accessibility to market) and Product Sup- 
ply Space Potential (if it be considered as a measure of the accessibility to the 
aggregate production of the commodity) vary continuously in space. They, 
therefore, are not subject to the effect of the discontinuity to spatial relation- 
ships that the dividing of data into regions, or areas, or units in a grid imposes 
regardless of whether such a discontinuity be present or not. 

With regard to the second suggested difficulty, supply space potentials are 
measures of proximity at a point to aggregate production whether that total 
production results from a few very large producers or a very large number of 
small producers variously distributed in either case. 

However, in testing to find the degree of spatial association between two 
space potential distributions, it is not simply a matter of matching the two 
sets of equipotential lines for the reason that use of different contour intervals 
on a map results in the bunching or spreading out of contours and an apparent 
(but not real) change in “slope.” A right angle crossing of the two contour sets 
would, however, indicate no correlation. 

It is not just lines on maps which are being compared, but rather the spatial 
coincidence (or lack of it) in the geographical variations in the intensities of 
two spatially continuous variables. 

The test might ideally need to be made for the infinite number of points in
the plane. A sample of a given number of randomly selected points might suffice 
with the sample size controlling the safety of the estimate. However, for the 
sake of example and simplicity the 48 control points, on which the maps were 
developed using logical contouring, were used as the units of association. 
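
The random-sample alternative mentioned above might be sketched as follows. This is an assumption-laden illustration rather than the procedure used in the paper, which relied on the 48 control points. The rectangle bounds, the seed, the `min_distance` floor, and both function names are hypothetical.

```python
import math
import random

def potential_at(point, sources, min_distance=1.0):
    """Sum of weight / distance over all sources, with the distance floored."""
    return sum(w / max(math.hypot(x - point[0], y - point[1]), min_distance)
               for (x, y), w in sources)

def sample_potentials(n, bounds, demand, supply, seed=0):
    """Evaluate two potential surfaces at n random points in a rectangle.

    bounds -- ((x0, y0), (x1, y1)) corners of the sampling rectangle
    demand, supply -- lists of ((x, y), weight) sources
    """
    rng = random.Random(seed)  # seeded so the sample can be reproduced
    (x0, y0), (x1, y1) = bounds
    points = [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for _ in range(n)]
    return [(potential_at(p, demand), potential_at(p, supply)) for p in points]
```

The resulting pairs of values play the same role as the paired values at the control points: they are the units on which the association is measured.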

For exploratory purposes a ranking method of correlation was used with the 
coefficient of rank correlation (a straight line to the ranks) considered in this 
special case as the Coefficient of Spatial Association of market and production. 

In each case the GEPP data were compared with the Product Supply Space 
Potential data for each of the 48 control points. 

The control points were ranked by value for GEPP and for each of the Prod- 
uct Supply Space Potentials in turn. These data furnished the basis for the com- 
putation of the correlation coefficient. In the terms of data of this particular 







problem the coefficients found can be considered as indexes of “Market Orientation
of Production” from the spatial, i.e., geographical standpoint.
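
The ranking procedure can be sketched as below. The paper does not name the formula used, so Spearman's coefficient, 1 - 6Σd²/(n(n² - 1)), is assumed here. It is valid only when no values are tied. The five pairs of potentials are invented for illustration.

```python
def ranks(values):
    """Rank 1 = largest value; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def rank_correlation(a, b):
    """Spearman's coefficient, 1 - 6*sum(d^2) / (n*(n^2 - 1)), no ties."""
    n = len(a)
    d2 = sum((ra - rb) ** 2 for ra, rb in zip(ranks(a), ranks(b)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical potentials at five control points.
gepp = [30.0, 25.0, 20.0, 10.0, 5.0]
supply = [8.0, 9.0, 6.0, 3.0, 1.0]
rho = rank_correlation(gepp, supply)
```

Here only the first two control points exchange ranks, so Σd² = 2 and the coefficient is 1 - 12/120 = 0.9. The same function would be applied to the 48 pairs of GEPP and supply potential values.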

The following coefficients of Spatial Association with GEPP were found 
from the data portrayed in Figures 1 and 2 and also Figures 1 and 3: 

Wheat = -.258
Ice cream = .990
Other coefficients (for which maps of the Product Supply Space Potentials are
not included in this paper) have been found to be:
Potatoes = .767
Strawberries = .587
Onions = .395

These coefficients, once discovered, are surprising to no one knowing the eco- 
nomic geography of the United States. Nonetheless, the concepts and method- 
ology furnish a means for studying more abstruse circumstances using quanti- 
tative data. 

For the purposes of this paper the GEPP map has been used to indicate the 
“market” for each commodity (as it had been used in an earlier work to repre- 
sent the demand force affecting price for each product). 

It is recognized that a more elegant and rigorous treatment might include 
considerations of income elasticity of demand, customs and tastes, prices of 
substitutes, and other topics in demand theory in developing weighting factors 
in the tailoring of the market or demand map for the individual product under 
analysis. It is also recognized that Market-Production Time Association should 
be considered. And, as suggested, this matter has been investigated by the 
author by means of time potentials. Findings will be presented later.

It is hoped that this paper has demonstrated some of the properties and 
usefulness of spacing measures in statistical procedures and has also been of 
interest to specialists in economics and economic geography. 





PRACTICAL VALUE OF INTERNATIONAL 
EDUCATIONAL STATISTICS 


GUSTAVE ZAKRZEWSKI
Inter-American Statistical Institute 


INTRODUCTION 


Two recent major international attempts to reach an integrated view of the
educational situation of the world, namely the United Nations’ Preliminary
Report on the World Social Situation [1] and the United Nations Educational, 
Scientific and Cultural Organization’s World Survey of Education [2] show 
clearly how great the gaps are in the field of educational statistics, and how 
limited still is the amount of data susceptible of being summarized globally to 
give the makers of economic and social policy a clear understanding of the 
impact of education on economic and social phenomena, and to provide them 
with quantitative illustration of the reasons for and the consequences of their 
actions. 

There are indications that at the present time educational statistics are re- 
ceiving more attention than formerly and most countries now make an effort 
to collect and publish some educational data, but the variability in scope of 
available material is enormous. According to the World Survey of Education, a 
survey of publications containing educational statistics of 53 countries and 
26 territories shows that, except for total enrollment at the primary level, no 
single item was reported by all the countries and territories. Moreover the in- 
comparability of available data reduces considerably their practical value at 
the international level. 

A great deal of study and concentrated action, therefore, is still needed to 
standardize procedures in order to ensure the adequate practical value of the 
data collected. 


PRELIMINARY SUGGESTIONS FOR INTERNATIONAL RECOMMENDATIONS 


Activities directed towards the introduction of certain uniformity in the 
field of educational statistics are being carried on by the United Nations and 
by Unesco. 

The United Nations is concerned with the measure of educational character- 
istics of the population, i.e., literacy, educational level, and school attendance, 
in connection with the population census program, while Unesco is interested 
in the whole system of educational statistics. 

The preliminary suggestions for international recommendations, formulated 
so far by the United Nations [3] and by Unesco [4] in the field of educational 
characteristics and basic educational statistics, are presented below in summary 
form. 


Educational characteristics of the population. The concepts covered by the 
United Nations’ preliminary suggestions are: 









Literacy—ability both to read and to write a simple message in any one 
language; the ability to read only or to write only, or to write figures and 
own name, should not constitute literacy. 

Level of education—highest level of instruction completed in the country’s 
regular educational system in terms of completed years or grades. 

School attendance—regular school attendance during the last regular school 
month prior to the census. 

The following classifications are suggested: 

Literacy—all population 10 years of age and over; 

Level of education—all persons above the minimum age for normal or compul- 
sory entrance into school, cross-tabulated by age for persons under 20 years 
of age; 

School attendance—age groups coinciding with the ages for which school at- 
tendance is considered compulsory or customary, subdivided according to 
age if data are collected with regard to less formal, specialized or advanced 
types of schools. 

The Unesco preliminary suggestions comprise only one topic in the field of
educational characteristics of the population, namely literacy:

Literacy—ability both to read with understanding and to write a short 
simple statement on everyday life; 

Semi-literacy—ability to read with understanding, but not to write a short 
simple statement on everyday life. 

The following classifications are suggested: 

Literacy—by sex and age groups: 10-14 (if the investigation covers the popu- 
lation under 15); 15-19; 20-24; 25-34; 35-44; 45-54; 55-64; 65 and over, 
and, where appropriate, urban-rural; race; nationality; religion or lan- 
guage; social or occupational status. 
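
The Unesco definitions and age classification above can be expressed as a short sketch. The function names are hypothetical, and the label "illiterate" for the remaining category is an assumption, since the suggestions as summarized here define only literacy and semi-literacy.

```python
def literacy_status(can_read, can_write):
    """Classify a person under the Unesco preliminary definitions.

    can_read  -- able to read with understanding
    can_write -- able to write a short simple statement on everyday life
    """
    if can_read and can_write:
        return "literate"
    if can_read:
        return "semi-literate"
    return "illiterate"  # assumed label; not defined in the summary above

AGE_GROUPS = [(10, 14), (15, 19), (20, 24), (25, 34),
              (35, 44), (45, 54), (55, 64)]

def age_group(age):
    """Return the suggested age-group label, or None below age 10."""
    if age >= 65:
        return "65 and over"
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None
```

A tabulation by sex and age group then reduces to counting persons for each combination of sex, `age_group`, and `literacy_status`.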


Basic educational statistics. The concepts, classifications and tabulations 
covered by the Unesco preliminary suggestions are: 


School age population—population between 5 and 14 years of age inclusive. 

Compulsory school age population—population between the age limits of 
compulsory full-time education. 

School—group of pupils organized as a single educational unit under one or 
more teachers with an immediate head (public school; private school; 
government-aided and non-aided schools). 

Class—group of pupils who are usually instructed together during a school 
term by a teacher, or by various teachers in sequence. 

Grade—a stage on the educational ladder of one school year’s duration. 

Pupil (or student)—a person enrolled for full-time or part-time education 
at any level. 

Teacher—person directly engaged in educating a group of pupils or students. 

The suggested classification by levels and type of study and tabulation is
summarized in the following table:

For each type of school: Public, private, government-aided, non-aided 







                                         Number of
Level or type of study        Schools  Classes  Teachers  Pupils  Graduates

First
Second
Third:
  General:
    Lower stage
    Higher stage
  Technical or vocational
    By type of study
Fourth (a):
Teacher training
Special education
Supplementary education

[The x entries marking required data, and the cells carrying notes (b) and (c), are illegible in the source.]

Note. First level covers pre-school education; second, elementary (primary); third, secondary (high school); fourth, higher education.
x: data required; (a) In addition: number of first year students and those preparing for a diploma; (b) No. of
faculties; (c) Students obtaining diplomas.

Further, the following degree of detail of classification is suggested: 

Teachers—qualified and non-qualified, by sex at the first three levels and in 
special education; by faculty, titular status and sex, at the fourth level. 

Pupils—by level and type of education and either by sex and age, or by sex 
and grade; students at the fourth level by nationality. 

Public expenditure on education—by source of funds, object of expenditure, 
level and type of education. 


SCOPE AND PURPOSE OF EDUCATIONAL STATISTICS 


Before an attempt is made to assess the “practical value” at the international 
level of statistics in the field of education that are produced, or will be pro- 
duced as a result of international recommendations, the scope and the purpose 
of educational statistics must first be determined, and a yardstick must be 
found to make possible the evaluation of the practical value of such statistics. 

Education may be conceived of as a manufacturing process, and people the 
material subject to transformation through education into an economic prod- 
uct, whose value depends on the quantity and kind of “value added” during the 
process of education (transformation). 

Accordingly the population may be divided into three broad groups: 

the population already beyond the educational process; 

the population subject to the educational process; and 

the population that has not yet entered the educational process. 

The scope of educational statistics may be delimited as follows: 

statistics concerning the educational characteristics of the population; 

statistics concerning the educational system, or basic educational statistics; 
and 

statistics concerning the future demand for educational facilities, or ed- 
ucational projections. 

In each group different problems arise. The role of statistics is to provide the 
quantitative measures that can best aid in the solution of these problems. 

In the first case—population that is already beyond the formal educational 
process—the problem consists in determining and evaluating the existing level 







of education of the population, assessing its value in the light of current needs, 
the progress achieved, and past performance of the existing school system, 
and on this basis to plan the necessary improvements. The role of statistics, 
therefore, is to provide an adequate measure of the “amount” and kind of
education that was imparted to this population.

In the second case—population subject to the process of formal education— 
the problem consists in determining and evaluating the existing educational 
system, its capacity, effectiveness and cost, and on this basis to assess the past 
performance, improve present effectiveness, and plan future developments. The 
role of statistics, therefore, is to provide adequate measures of the capacity, 
type, effectiveness of the educational system, and of the degree to which the 
population makes use of it. 

In the third case—population that has not yet entered the stage of the 
process of formal education—the problem consists in determining and evaluat- 
ing the future pressure on the educational system, and on this basis to plan 
properly future developments. The role of statistics, therefore, is to provide, 
with sufficient anticipation, an adequate measure of future demand for educa- 
tional services. 

Consequently, to be of practical value educational statistics must provide 
elements that give data for the most satisfactory solution of the problems 
faced, be readily comparable between different school systems within a country 
and between countries, and be easily adaptable for analytical purposes. The 
degree to which they are pertinent to the problem, are comparable, and have 
analytical qualities determines therefore the extent to which educational sta- 
tistics have practical value. 


EDUCATIONAL CHARACTERISTICS OF THE POPULATION 


Three categories of statistical measures are applied at present, or are sug- 
gested in the field of educational characteristics of the population: literacy, 
educational level, and school attendance. 

Literacy. Traditionally the degree of education of a population was expressed 
by a negative measure, namely the percentage of population illiterate, i.e., 
population that did not meet a certain minimum standard of literacy, the 
criterion of literacy being the ability to read and write, variously defined. 

In a society where educational facilities are not properly developed the per- 
centage of the population who are illiterate may be considered as a satisfactory 
measure of its cultural development, or as a measure of ability to read and 
write resulting from provision for compulsory education. 

With the extension of education, the problem of illiteracy ceased to exist
or was reduced to insignificant proportions in the more advanced countries of
the world and greatly reduced in other countries. Moreover technical progress, 
by imposing greater educational requirements on an increasing proportion of 
the population, created new problems, requiring new solutions, and the effec- 
tiveness of the illiteracy measure in relation to educational development be- 
came completely inadequate. 

Despite the fact that the measure of literacy still has a relatively large field 
of application, and that its comparability may be greatly improved, its practi- 







cal value internationally is small and is decreasing, considering that it gives 
no clue to the basic problem and its analytical qualities are low.

Educational level. The new measure of the “intensity” of education, or the 
educational level of the population, expressed in terms of the highest grade (or 
year) of school satisfactorily completed, provides a more adequate solution of 
the problem. The amount of education imparted, expressed in terms of the 
grade attained, can be converted into the number of years required to attain 
such grade, and the number of years represents a readily comparable unit, with 
good analytical qualities. The measurement of educational development in 
terms of years of schooling does not take into account the quantitative and 
qualitative variations in the meaning of “one year of school satisfactorily com- 
pleted,” but certain minimum standards of education may be expected to cor- 
respond to certain numbers of years of schooling. Undoubtedly a system of 
weights corresponding to the different educational systems could increase con- 
siderably the comparability and analytical properties of such data. 
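
The conversion of grades attained into years, and the effect of a weight for a particular educational system, can be sketched as follows. The distribution, the function name, and the weight of 0.9 are hypothetical. The text suggests that such weights could help but does not specify any.

```python
def mean_years_of_schooling(distribution, weight=1.0):
    """Average years of schooling from {years_completed: persons}.

    weight -- hypothetical factor expressing one school year of this
              system in units of a chosen reference system
    """
    persons = sum(distribution.values())
    years = sum(y * n for y, n in distribution.items())
    return weight * years / persons

# Hypothetical population: 200 persons with no schooling,
# 500 with 4 years completed, 300 with 8 years completed.
population = {0: 200, 4: 500, 8: 300}
unweighted = mean_years_of_schooling(population)
weighted = mean_years_of_schooling(population, weight=0.9)
```

The unweighted figure is (0 + 2,000 + 2,400) / 1,000 = 4.4 years. A weight below 1 lowers it in proportion.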

The effectiveness of this measure is limited to literate areas, but even in 
areas where large proportions of population did not complete any grade (or 
year) of school, or can not read and write, the measure of the educational level 
attained may complement and increase the analytical value of the measure of 
literacy, by throwing, for instance, some light on the effectiveness of the existing 
school system in assuring permanent literacy (e.g., population illiterate distributed
by years of schooling completed). The internationally practical value
of this measure, therefore, can be considered as good and of increasing im- 
portance. 

School attendance. While statistics on the educational level of the population 
concern the population already beyond the formal education, statistics on 
school attendance of the population concern the population subject to the 
educational process, and consequently are closely related to basic educational 
statistics. 

The rates of attendance by age provide a measure of great comparability in 
space and in time. They solve the problem of providing elements for the estima- 
tion of the future level of education of the population and complement basic 
educational statistics. The analytical value of this measure is good and in- 
creases with the degree of detail covered or in correlation with basic educational 
statistics (age-grade distribution of enrollment, by level and type of education). 
The internationally practical value of this measure varies with the degree of 
detail covered and/or the practical value of available basic educational sta- 
tistics. 
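
A minimal sketch of attendance rates by single year of age, assuming census counts of persons attending school and of total population at each age. The figures and the function name are invented for illustration.

```python
def attendance_rates(attending, population):
    """Attendance rate per 100 persons for each single year of age.

    attending  -- {age: persons attending school}
    population -- {age: all persons of that age}
    """
    return {age: 100.0 * attending.get(age, 0) / population[age]
            for age in population if population[age] > 0}

# Hypothetical census counts.
population = {6: 1000, 7: 1000, 8: 800}
attending = {6: 600, 7: 900, 8: 720}
rates = attendance_rates(attending, population)
```

Because the rate is a ratio to the population of the same age, it can be compared between areas and between census dates regardless of the size of the population.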


BASIC EDUCATIONAL STATISTICS 


The scope, completeness of coverage, classification, and the degree of detail 
of the basic educational statistics at the national level reflect the degree of 
social and economic development of a country, the extent and vitality of its 
educational services, and the level of understanding of the importance of edu- 
cational statistics that are of practical value. Moreover, educational statistics 
collected by individual countries and territories of the world represent indi- 
vidual school systems which grew under the pressure of economic and demo- 







graphic conditions, social and cultural backgrounds, and the forms of govern- 
ment and political tendencies, all of which vary from country to country. 

Thus, while at the national level the data collected may in many cases aid in 
the solution of particular problems and be of practical value to the country con- 
cerned (in some countries with highly developed statistical services the informa- 
tion possessed is sometimes more detailed than is necessary at the international 
level), certain uniformity in concepts, scope, completeness of coverage, degree 
of detail, and classification used is indispensable to the improvement of the 
international practical value of the data collected. 

The only uniformity possible to achieve up to now at the international level 
seems to be in the field of presentation. Educational statistics are classified 
vertically into levels (pre-school, primary, secondary, higher), and horizontally, 
within the levels, into establishments, teaching staff and enrollment. 

In the vertical classification the levels differ from country to country, and 
sometimes within the same country, by length of schooling, type of education 
imparted, age of admission; in the horizontal classification, the number of 
establishments, the number of teachers and the number of pupils enrolled, 
represent incomparable units as to concepts and coverage. Thus, for instance, 
a one-room, one-teacher rural school cannot be compared with a large metro- 
politan school; primary enrollment in a school system with a 4-year curriculum 
is not comparable with one of 6, 7, or 8 years; teacher or vocational training 
at the secondary level is not comparable with such training at the higher educa- 
tional level; junior college enrollment is not comparable with graduate school 
enrollment, etc. 

This double incomparability reduces considerably the analytical qualities 
of available basic educational statistics; these data only solve the problem of 
securing meaningful comparable statistics by providing summary information 
essential to understanding an individual school system, or establishing general 
trends. Consequently their international practical value is low. 

Even if all countries should adopt a standard school system, and complete 
uniformity could be achieved as to the vertical classification and as to concepts 
of school, teacher, and pupil, the comparability and analytical qualities of the 
respective units within this uniform frame would be affected by their compo- 
sition. Thus, for example, the number of schools would comprise schools of 
different sizes, the number of teachers would comprise teachers of different 
qualifications, and the number of pupils enrolled would comprise different 
distributions by grades and ages. 

This indicates the need for a greater degree of detail and the adoption of 
units of measurement smaller than the global totals. 

Enrollment. The purpose of statistics on school enrollment is to obtain a 
measure of the effectiveness, extent (horizontal and vertical), and develop- 
ment of the school system—how many students attend school at the several 
levels (or in various types of education), what progress they make in school, 
how advanced or retarded they are with respect to the expected grades (or 
levels), how far they proceed up the educational ladder, how many complete 
their studies successfully, and, in correlation with the population data, to 
establish the educational effort of the population, and the proportion of the 







population making use of the educational services of various levels and types 
(enrollment rates). 

There is one particular group of statistics which seem to solve the problem 
very efficiently, namely the enrollment at each level (and type) of education, 
classified and cross-classified by age, grade (or type of study) and sex. The age-grade
(or type of study) distribution combined in a two-dimensional age-grade
table provides statistical measures comparable in space and in time and possesses
other good analytical qualities. In the words of the Unesco World Survey of
Education, “A study of such a table can afford a real insight into the educational 
policy of the country or, perhaps more importantly, into the extent to which 
an established policy has been successful. It has significance in relation to the 
basic problems of rates of advancement through the school, curriculum control, 
teaching standards and the whole framework of school organization. . . . The 
adequate study of expansion and wastage is possible only if distribution figures 
are available over a number of years, so that one may trace successively the 
school life of each 100 pupils entering the first grade” [2, pp. 20 and 52]. 

Correlated with population data, by single years of age, the age-grade dis- 
tribution affords rates of enrollment (or school attendance) by age, grade, and 
type of study. Such data, fitted into the organizational diagram of the school 
system, provide a picture of the educational process, the proportions of the 
population subject to this process at the various levels and in the various types 
of education, and of the symmetry or asymmetry of school enrollment (or at- 
tendance) in relation to the established school organization. 
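
The calculation described above can be sketched in a few lines. This is only an illustration: the age-grade table, the population figures and the names `age_grade`, `population` and `enrollment_rates_by_age` are hypothetical and do not come from any country's returns.

```python
# Hypothetical age-grade table: age_grade[age][grade] = pupils enrolled.
age_grade = {
    6: {1: 800, 2: 50},
    7: {1: 300, 2: 600, 3: 40},
    8: {1: 60, 2: 280, 3: 520},
}

# Hypothetical population by single years of age.
population = {6: 1000, 7: 1000, 8: 1000}


def enrollment_rates_by_age(table, pop):
    """Enrollment rate (per cent) at each age:
    pupils of that age in all grades divided by the population of that age."""
    return {age: 100.0 * sum(grades.values()) / pop[age]
            for age, grades in table.items()}


rates = enrollment_rates_by_age(age_grade, population)
```

Summing the same table by columns instead of rows would give enrollment by grade, which is the form needed when the data are summarized by level.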

Data on age-grade (and type of study) distribution may be easily sum- 
marized, according to actual (incomparable) or arbitrary (more comparable) 
levels, as convenient for international purposes. 

Summing up, the age-grade-sex distribution can be considered as having 
high practical value internationally. 

Details concerning students entering the first year of study at each level, or 
in each type of education, and those who have successfully completed the pre- 
scribed course of study, increase still more the analytical value of such enroll- 
ment data. 

Teaching staff. The purpose of statistics concerning the teaching staff is to 
provide a measure of the available teaching staff and its teaching potential. 
The requirements relative to the analytical qualities of such statistics imply 
again the introduction of a greater detail of classification. This requirement 
seems to be better met if teachers are classified by qualifications and by sex, 
while the incomparability at the international level of the concept “teacher,” 
resulting from the varying amount of time spent on teaching by the individual 
(full-time, part-time, teachers working overtime), could be reduced by the 
adoption of the teacher-hour as a unit of measurement. 

Schools. The purpose of obtaining statistics concerning schools is to provide 
a measure of the capacity of the existing school system. Again in this case, the 
introduction of a greater detail of classification seems advisable. A classification 
of establishments by number of teachers (or teacher-units) and number of 
grades (classes) would increase the comparability and analytical qualities of 
the unit. It seems that the incomparability of the concept “school,” or its 
subdivision, “class,” could be further reduced and the problem of comparability of 
data better solved by the adoption as a unit of measurement of a standard 
school unit based on the number of school places. 

The cost of education. The purpose of financial statistics in the field of educa- 
tion is to provide data that would allow assessment of present efforts and needs, 
together with determination of the amount of money available for education 
from different sources and the costs of education by object (investment, mainte- 
nance, current expenses) and by type of expenditure (administration, salaries of 
teachers, scholarships, school meals, subsidies to other school systems, etc.) and 
by levels and types of education. The information now available does not 
permit of valid international comparisons. The available figures are based on 
various concepts and in many cases only the budget of the Ministry of Educa- 
tion is being reported. 

The international comparability of data in this field is further affected by 
the incomparability of national accounts expressed in current money terms. 
An adequate degree of detail in reporting should be helpful, and an interna- 
tional conversion unit, computed for such purposes, may reduce considerably 
the incomparability and provide a measure that would allow comparisons be- 
tween countries and, over time, in terms of constant prices. 


EDUCATIONAL PROJECTIONS 


No modern country can plan for its future without some assumption as to 
the size, age composition, and qualifications of the population for which it is 
planning. In the field of education adequate planning for the educational system 
requires knowledge of the numbers and the regional distribution of the 
children of pre-school, lower and upper school ages. 

Extension of sanitation and epidemic control produces, in most countries 
and territories of the world, an increase of the child population and pressure on 
limited and often inadequate educational resources. Consequently most coun- 
tries are facing the problem of planning for additional facilities to meet an 
anticipated continued increase in the school age population, in addition to 
rising rates of enrollment by age which, at the primary school level, should be 
very close to 100 per cent. 

Educational projections are essential also to the analysis of the broader 
aspects of national economic and social planning, particularly concerning the 
expected quality of future active population and its adaptability to the chang- 
ing requirements of a modern economy. 

The role of statistics, therefore, is to provide estimates of the future volume 
and composition of the school age population. The practical value of educa- 
tional projections is determined by the availability, degree of detail and ac- 
curacy of birth rates, death rates and, where pertinent, of data on migration 
movements, as well as by the basic techniques used for projections to assure 
their international comparability. 


PROBLEMS TO BE SOLVED 


Improvement of the practical value of educational statistics at the interna- 
tional level implies, therefore, the solution of problems concerning the entire 
system of educational statistics and numerous details within the system. 







The most important prerequisite for an adequate solution is undoubtedly the 
proper identification of the problems faced in each area of educational statistics. 

The problems of a general nature are: the design of a system of educational 
statistics, well integrated within itself as well as consistent and well integrated 
with related statistical fields, and the introduction of such a system at the 
national level. 

The problems concerning details within the system comprise the introduction 
of uniformity of concepts, coverage, time references, degree of detail of classifi- 
cation, tabulation, and methods of collection of the pertinent data. 

An appraisal of the preliminary international recommendations in the light 
of the practical value of the data to be collected brings to light some of the 
problems that still remain to be solved. 

First among these seems to be the problem of designing a well integrated and 
consistent comprehensive system covering the three areas of educational sta- 
tistics. This should be the task of Unesco as the specialized agency in charge of 
problems concerning education. 

The consistency of concepts in the field of educational characteristics of the 
population would be achieved by accepting the concepts of literacy, level of 
education, and school attendance, which are more or less adequate. 

Educational characteristics of the population should not be restricted to the 
census program but should be included in the comprehensive system (such data 
may be obtained by sampling surveys). To ensure the greatest practical value 
of data on educational characteristics the recommendations concerning the 
whole system should emphasize the degree of detail of classification, important 
from the point of view of the analytical value of such statistics, e.g., classifica- 
tion by personal characteristics (sex, age, marital status), by economic char- 
acteristics (occupation, industry, status, activity), and, where appropriate, by 
cultural (language, religion) or other pertinent characteristics. The census re- 
quirements could be more general or suggest several lists of topics, but should 
be consistent with the recommendations concerning the entire system. The 
division of the population into the three broad groups, and the criteria of the 
practical value of the several types of data should be helpful in establishing a 
classification by age and, in particular, in determining the proper age limits to 
which the particular concepts should apply. 

In the field of basic educational statistics, it seems that many problems 
could be solved by departing from the concept of a rigid frame, to which all the 
details must conform, and instead concentrating on the details and proceeding 
towards the summary picture. 

Such an approach would reduce the need for standard international defini- 
tions of concepts and uniform classification of the school system, often difficult 
to accept at the national level. Some of the new problems that the preliminary 
suggestions concerning basic educational statistics could create have already 
been brought to the attention of the members of the International Statistical 
Institute at its 28th session in Rome in 1954. The question was raised as to the 
proposed nomenclature of levels (first, second, etc.), and the proposed classifica- 
tion of teachers’ education. A number of countries have expressed dissatisfac- 
tion with the proposed distinction between government-financed, government- 
aided, and independent schools. [5] 







It seems that instead of attempting to impose certain uniformity at the na- 
tional levels, efforts should be made to introduce suitable complementary in- 
formation that should accompany national data, so that, in spite of the struc- 
tural differences of school systems, the degree of comparability would be in- 
creased. The complementary information should include such topics as: 

a diagram of the school system, showing the length of study at each level and branch of study; 

the time reference to which tabulated data refer; 

the length of the school year (or the duration of the course), including dates of beginning and ending the school year; 

definition of data (enrollment, attendance); 

minimum age of admission at the various levels (or types of study). 

A further improvement in the practical value of basic educational statistics 
would be insured by greater attention to such factors as the time intervals at 
which data should be collected; the completeness of coverage (public-private); 
geographical details (urban-rural); distinction between school enrollment and 
school attendance; methods of collection of data; and development of a system 
of weights corresponding to the respective school systems to increase the quali- 
tative comparability of educational statistics. 

In the field of educational projections the main problem seems to be the de- 
velopment of proper standard techniques that would allow the best results and 
an improvement of the basic demographic data, still imperfect in many parts 
of the world. 

The introduction of international recommendations at the national level 
signifies international requirements, and international requirements may create 
difficult procedural problems within a country where the primary data are 
collected. Consequently, to make a statistical system acceptable at the national 
level the information requested should include items of greatest value and 
maximum use to the country concerned. 

It is most probable that if the problems are properly identified, and the data 
requested are of practical value for international comparison and analysis, in 
most cases they should also be of practical value at the national level. Conse- 
quently, the higher the practical value of data requested, the easier it should be 
for the countries to accept them as reasonable, and try to produce them. 


REFERENCES 


[1] United Nations. Economic and Social Council. Social Commission, 8th session. 
Preliminary Report on the World Social Situation with Special Reference to Standards of 
Living. (E/CN.5/267/Rev. 1) New York, 1952. 180 p. 

[2] United Nations Educational, Scientific and Cultural Organization. World Survey of 
Education. Handbook of Educational Organization and Statistics. (ED.54.D.6A.) Paris, 
1955. 943 p. 

[3] United Nations. Statistical Office of the United Nations. Population Census 
Programme. Draft International Recommendations. ST/STAT/P/L.1. 10 May 1955. 

[4] UNESCO. General Conference. Eighth Session. Programme and Budget Commission. 
Standardization of Educational Statistics. Preliminary Study of the Technical and Legal 
Aspects. Annex III. 8C/PRG/2. Paris. 9 July 1954. 

[5] UNESCO. Problems on the Standardization of Certain Aspects of Educational Statistics. 
International Statistical Institute, 28th Session. Bulletin de l’Institut International de 
Statistique. Tome XXXIV, 3ème Livraison. Rome, 1954. 





REGRESSION TECHNIQUES APPLIED TO SEASONAL CORRECTIONS 
AND ADJUSTMENTS FOR CALENDAR SHIFTS* 


Harry Eisenpress† 
National Bureau of Economic Research 


The variability of the calendar from year to year is a disturbing factor in 
the calculation of seasonal adjustments of time series. The number of 
Saturdays and the number of Sundays in a given month, say, January, changes 
from year to year; a holiday which is fixed in terms of its date in the month, 
e.g., Independence Day, can occur on a different day of the week in successive 
years; Easter can fall as early as March 22 and as late as April 25—any of 
these factors can distort the usual seasonal measures and make it necessary to 
use special corrections. Devices such as working-day adjustments and Easter 
corrections are the well-known solutions of these problems and have been used 
successfully for many years. The present note suggests a general method of 
handling such calendar shifts, in conjunction with seasonal adjustment, by 
means of regression techniques, and illustrates the method with the seasonal 
correction of bank debits outside New York City for 1946-54, and of tobacco 
manufactures for 1947-53. 

Regression techniques are no novelty in seasonal computations. Horst 
Mendershausen carried out a thoroughgoing regression analysis of construction 
employment seasonals in 1939.¹ The expensiveness of such intensive analysis 
for usual seasonal computations has prevented its application from becoming 
widespread. However, where a seasonal is fairly stable except for calendar 
shifts, the use of regressions to correct for such shifts is simple and not very 
costly, and is often preferable to alternative methods. 

The method used may be described as follows: To correct a time series for 
regular seasonal variations and calendar shifts simultaneously, compute ratios 
to the 12-month moving average. Adjust the ratios for each year so that their 
sum is 1200. For each month, from January through December, compute the 
regression of the adjusted ratios, $y$, on the calendar-shift variables, $x_1$, $x_2$, etc. 
For example, $x_1$ may be taken as the number of Saturdays in the month; $x_2$, the 
number of Sundays in the month; and $x_3$ (for February), the number of days 
in the month (to take care of the leap year problem). The regression values of 
the ratios are the seasonal adjustment factors, combining the usual stable 
seasonal adjustment with a correction for calendar shifts. The original series is 
divided by the series of seasonal adjustment factors, month by month, to ob- 
tain the seasonally-adjusted series. 
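
The first two steps of the method (ratios to the 12-month moving average, then rescaling each year's ratios to sum to 1200) might be sketched as follows. The text does not say how the moving average is centered; the centered 13-term form with half weights at the ends is an assumption made here, and the function names are illustrative only.

```python
def centered_ma12(series):
    """Centered 12-month moving average (assumed form: half weight on the
    two end months). Entries that cannot be computed are None."""
    n = len(series)
    out = [None] * n
    for i in range(6, n - 6):
        inner = sum(series[i - 5:i + 6])            # 11 full-weight months
        total = 0.5 * series[i - 6] + inner + 0.5 * series[i + 6]
        out[i] = total / 12.0
    return out


def ratios_to_ma(series):
    """Ratio, in per cent, of each observation to the moving average."""
    ma = centered_ma12(series)
    return [None if m is None else 100.0 * x / m
            for x, m in zip(series, ma)]


def rescale_year(ratios):
    """Scale the 12 ratios of one calendar year so that they sum to 1200."""
    factor = 1200.0 / sum(ratios)
    return [r * factor for r in ratios]


# A flat series has ratio 100 in every month where the average exists.
flat_ratios = ratios_to_ma([100.0] * 36)
```

The rescaled ratios for each calendar month, collected across years, are the dependent variable $y$ of that month's regression.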





* The author wishes to acknowledge his indebtedness to Millard Hastay for suggesting the seasonal methods 
described below, and to Charlotte Boschan for intelligent assistance in all phases of the work. 

† At present with Remington Rand Univac Division of Sperry Rand Corporation. 

¹ See his article, “Eliminating changing seasonals by multiple regression analysis,” Review of Economic Statistics, 
Nov. 1939, 171-77. See also Julius Shiskin, “A new multiplicative index,” Journal of the American Statistical Associa- 
tion, December 1942, 507-8, and the references cited in this article. 




CHART 1 

BANK DEBITS OUTSIDE NEW YORK CITY, 1946-54, WITH AND WITHOUT WORKING 
DAY ADJUSTMENTS, SEASONAL ADJUSTMENTS BY MOVING-AVERAGE METHOD 

[Chart: original observations and seasonally-adjusted data, monthly, 1946-54, ratio scale. Panel A: Without Adjustment for Number of Working Days (billions of dollars). Panel B: Adjusted for Number of Working Days (billions of dollars per working day).] 





For ease of computation $x_1$ may be taken as the number of Saturdays in the 
month less 4, $x_3$ as the number of days in February less 28, etc. It is thus usually 
possible to reduce the calendar-shift variables to (0, 1) variables and to simplify 
the computation considerably. 
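
As an illustration of these reduced variables, the sketch below counts Saturdays and Sundays with Python's standard `calendar` and `datetime` modules. The function name `calendar_shift_vars` is hypothetical, and returning 0 for $x_3$ outside February is a convention adopted here, since the text defines $x_3$ for February only.

```python
import calendar
from datetime import date


def calendar_shift_vars(year, month):
    """Return (x1, x2, x3) for one month:
    x1 = number of Saturdays minus 4,
    x2 = number of Sundays minus 4,
    x3 = days in February minus 28 (0 for other months, by assumption)."""
    n_days = calendar.monthrange(year, month)[1]
    weekdays = [date(year, month, d).weekday() for d in range(1, n_days + 1)]
    x1 = weekdays.count(5) - 4      # date.weekday(): Monday = 0, Saturday = 5
    x2 = weekdays.count(6) - 4      # Sunday = 6
    x3 = n_days - 28 if month == 2 else 0
    return x1, x2, x3


feb_1948 = calendar_shift_vars(1948, 2)   # leap-year February with 5 Sundays
feb_1952 = calendar_shift_vars(1952, 2)   # leap-year February, 4 Saturdays and 4 Sundays
```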

As the first application of this method we take the series of bank debits 
outside New York City, for the period 1946-54. Since this series is a broad 
aggregate, its movements, after adjustment for seasonal variations, might be 
expected to be fairly smooth. Yet correction by current methods² yields a series 
that has many small choppy movements (Chart 1, A). Prior adjustment for the 
number of working days in each month does not improve the smoothness of 
the seasonally-adjusted series significantly (Chart 1, B). Moreover, the use of a 
working-day adjustment for series other than production data may not be 
entirely justifiable.³ 

² The method used in Chart 1 for seasonal adjustment is that of the Bureau of the Census. For a description 
of this method see Julius Shiskin, “Seasonal computations on Univac,” The American Statistician, February 1955, 
19-23. 


CHART 2 

BANK DEBITS OUTSIDE NEW YORK CITY, 1946-54 
SEASONAL ADJUSTMENT BY REGRESSION METHOD 

[Chart: original observations and seasonally-adjusted data, monthly, 1946-54, billions of dollars, ratio scale.] 


In Chart 2 the regression method is applied to this series; the resulting adjust- 
ment is much smoother than the corresponding series in Chart 1. Further im- 
provement might be expected if a longer period could be used (provided the 
underlying seasonal pattern could be assumed to remain stable over such a 
period). The method fails to give an adequate correction for leap-year Feb- 
ruaries: Of the two leap-year cases in our data, the first (1948) occurs in con- 
junction with a 5-Sunday month, while the second (1952) has only 4 Saturdays 
and 4 Sundays. It is obviously incorrect to average the ratios of these two years 
to get an estimate of the leap-year February factor. The working-day adjust- 
ment method was therefore used for this month. To get an estimate of the 
effect of 5 Sundays (or 5 Saturdays) in February a period of at least 24 years 
would be needed, since that is the frequency of occurrence of such a month. 
Table 1 lists the regression equations for all the months, and Table 2 gives the 
standard errors of the regression coefficients. 
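
A least-squares fit of the kind summarized in Table 1 can be sketched with the standard library alone. The code below is an illustration, not the author's computing procedure: it solves the normal equations by Gaussian elimination, and the nine "years" of data are synthetic, generated from assumed coefficients so that the fit can be checked.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]     # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_month(y, x1, x2):
    """Least-squares fit of y = c0 + c1*x1 + c2*x2 for one calendar month.
    Returns [c0, c1, c2]."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic example: nine years of (0, 1) calendar variables and ratios
# generated from assumed coefficients 8.3, +0.1 and -0.4.
x1 = [0, 1, 0, 1, 0, 0, 1, 0, 1]
x2 = [0, 0, 1, 1, 0, 1, 0, 0, 1]
y = [8.3 + 0.1 * u - 0.4 * v for u, v in zip(x1, x2)]
coef = fit_month(y, x1, x2)
```

With only nine observations per month and two or three regressors, few degrees of freedom remain, which is consistent with the sizeable standard errors reported in Table 2.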

The positive signs of the regression coefficients of $x_1$ in July and September 
are rather puzzling. The explanation may be somewhat as follows: In both of 
these months, a 5-Saturday month must start near the end of the week, and 
thus may tend to lengthen the holiday period that occurs near the beginning 
of these months. This may lead to an increase in checking and debit activity 
for vacation expenditures, rather than the normal decrease associated with 
5-Saturday months at other seasons. This hypothesis needs further investiga- 
tion. 





³ Cf. Simon Kuznets’ remarks on the use of working-day adjustments in his Seasonal Variations in Industry 
and Trade (New York, 1933), p. 25, footnote 3, especially the following statement: “In many economic processes 
it is difficult to assume that volume of activity is directly proportional to the number of working days (for ex- 
ample, bank clearings or retail sales).” 







TABLE 1 


BANK DEBITS OUTSIDE NEW YORK CITY, 1946-54 
MONTHLY REGRESSIONS ON CALENDAR-SHIFT VARIABLES 


($y$ = ratio to 12-month moving total¹; $x_1$ = number of Saturdays in month, 
minus 4; $x_2$ = number of Sundays in month, minus 4) 


Month          Regression Equation 

January        $y = 8.764 - .091x_1 - .318x_2$ 
February*      $y = 7.495$ 
March          $y = 8.858 - .033x_1 - .133x_2$ 
April          $y = 8.162 - .175x_1 - .175x_2$ 
May            $y = 8.186 - .064x_1 - .143x_2$ 
June           $y = 8.527 - .236x_1 - .282x_2$ 
July           $y = 8.300 + .100x_1 - .400x_2$ 
August         $y = 8.321 - .183x_1 - .383x_2$ 
September      $y = 8.250 + .062x_1 - .538x_2$ 
October        $y = 8.804 - .323x_1 - .023x_2$ 
November       $y = 8.350 - .050x_1 - .350x_2$ 
December       $y = 9.236 - .182x_1 - .182x_2$ 

¹ Since moving totals were used instead of moving averages, the regression values of the ratios must be 
multiplied by 12. 
* See text. 
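
To show how an entry of Table 1 is used, the sketch below evaluates the regression value for a given month and multiplies it by 12, as the table's footnote requires, to obtain a seasonal factor in per cent. The coefficients are transcribed from Table 1; the dictionary and function names are illustrative.

```python
# (constant, coefficient of x1, coefficient of x2), transcribed from Table 1.
TABLE_1 = {
    "January":   (8.764, -0.091, -0.318),
    "February":  (7.495,  0.0,    0.0),
    "March":     (8.858, -0.033, -0.133),
    "April":     (8.162, -0.175, -0.175),
    "May":       (8.186, -0.064, -0.143),
    "June":      (8.527, -0.236, -0.282),
    "July":      (8.300,  0.100, -0.400),
    "August":    (8.321, -0.183, -0.383),
    "September": (8.250,  0.062, -0.538),
    "October":   (8.804, -0.323, -0.023),
    "November":  (8.350, -0.050, -0.350),
    "December":  (9.236, -0.182, -0.182),
}


def seasonal_factor(month, x1, x2):
    """Seasonal adjustment factor in per cent: 12 times the regression value,
    because the ratios in Table 1 are to 12-month moving totals."""
    c0, c1, c2 = TABLE_1[month]
    return 12.0 * (c0 + c1 * x1 + c2 * x2)


# A January with 5 Saturdays and 5 Sundays (x1 = 1, x2 = 1), e.g. January 1949.
jan_factor = seasonal_factor("January", 1, 1)
```

Dividing the original January figure by `jan_factor / 100` gives the seasonally-adjusted value. The February entry has no calendar terms because, as the text explains, that month was handled by the working-day method.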


TABLE 2 


STANDARD ERRORS OF REGRESSION COEFFICIENTS 
SHOWN IN TABLE 1 


                         Standard error of 

Month          Constant     Coefficient     Coefficient 
               term         of $x_1$        of $x_2$ 

January        .086         .133            .149 
February       .056 
March          .089         .134            .134 
April          .080         .161            .161 
May            .080         .110            .113 
June           .098         .170            .152 
July           .102         .177            .177 
August         .160         .241            .241 
September      .112         .225            .225 
October        .100         .174            .174 
November       .098         .170            .170 
December       .162         .280            .280 


In an attempt to improve upon the results, the regressions were recomputed 
with the following additional variables: In the months of January, February, 
May, July, October, November, and December, a holiday variable, $x_3$, was 
added, having the value “0” if the holiday fell on Saturday, and the value “1” 
otherwise. In March and in April the variable $x_3$ was the Easter variable, having 
the value “0” if Easter fell in March and $n$ if Easter fell on the $n$th day of April. 
The results obtained were almost indistinguishable from those presented above. 





CHART 3 

TOBACCO PRODUCTION, 1947-53 
COMPARISON OF FEDERAL RESERVE SEASONAL ADJUSTMENT WITH REGRESSION 
SEASONAL ADJUSTMENT 

[Chart: original observations and seasonally-adjusted data, monthly, 1947-53, index (1947-49 = 100), ratio scale. Panel A: Federal Reserve Adjustment. Panel B: Regression Method.] 



The second application of the regression method is to the Federal Reserve 
series on tobacco manufactures. In Chart 3, section A shows the series as cur- 
rently published, with a working-day adjustment applied to the data before 
seasonal correction. In section B the regression method is applied to the data 
without any working-day adjustment. The regression results must be considered 
tentative, since the period 1947-53 is not long enough to yield reliable results 
for several of the $x$'s, especially the holiday variables. Nevertheless, the 
regression correction in B compares very favorably with the moving seasonal 
correction in A. If a 10-12 year period could be used, the regression method would 
perhaps prove to be the better method, provided the underlying seasonal were 
sufficiently stable over such a period. 

The advantages of the regression method over current methods are, first, that 
estimates of the effects of calendar shifts are obtained from the data directly 
and do not have to be postulated in advance. For example, it is not necessary 
to determine the weight to be assigned to a week-end holiday, as is done in 
making a working-day adjustment of a production series. The weight is 
automatically obtained as a regression coefficient; furthermore, it is free to vary 
from one holiday to the next. 

Second, in estimating the effects of calendar shifts and stable seasonal factors 
together, we make allowance for the presence of interaction between these two 
elements of seasonality. The sequential estimation of these elements implies 
that their interaction is zero; this assumption may not be altogether warranted. 

One shortcoming of the method lies in the fact that it cannot be combined 
with current moving seasonal methods. It yields smooth adjustments for a 
period of 9-12 years over which the basic seasonal movements are essentially 
stable; for shorter series its results must be considered tentative. It can be 
applied to long series by either breaking the data up into shorter homogeneous 
periods of 9-12 years each, or by applying the regression as a “moving” opera- 
tion to estimate the parameters of the central year of a 12-year period, using 
overlapping periods. 











THE RANKING OF VARIANCES IN 
NORMAL POPULATIONS* 


H. A. David 
University of Melbourne 


Two procedures for ranking variances of normal populations are considered. 
The first is essentially a gap-test and effects a separation of the 
variances into two groups whenever the ratio of successive mean squares, 
arranged in ascending order of magnitude, exceeds a critical tabulated 
value. The second method is a more elaborate multiple decision proce- 
dure based on the distribution of the maximum F-ratio. 


1. INTRODUCTION 


Suppose we have $k$ mean square estimates $s_t^2$ ($t = 1, 2, \ldots, k$), each based 
on $\nu$ degrees of freedom, of the variances $\sigma_t^2$ of $k$ normal populations. The 
equality of the variances may be tested either by Bartlett’s modification of 
the Neyman-Pearson $L_1$-test or by the maximum F-ratio criterion (Hartley 
[4], David [2]). But statistical significance on these over-all tests leaves open 
the question as to which variances are different from which. It is the purpose 
of this paper to deal with two approaches to this problem. 

With variances, unlike means, an elaborate procedure will often be unnecessary. 
However, a point which seems of some interest is whether the $k$ variances 
can be placed into two or more distinct groups. To this end the first 
method to be considered is an adaptation of the gap-test suggested by Tukey 
[6] in the case of means. Let $s^2_{(t)}$ be the $t$th mean square in order of magnitude, 
and $\sigma^2_{[t]}$ the variance corresponding to $s^2_{(t)}$. Then, whenever any of the ratios 
$s^2_{(t'+1)}/s^2_{(t')}$ ($t' = 1, 2, \ldots, k-1$) exceeds a critical tabulated value we declare 
that $\sigma^2_{[1]}, \sigma^2_{[2]}, \ldots, \sigma^2_{[t']}$ and $\sigma^2_{[t'+1]}, \sigma^2_{[t'+2]}, \ldots, \sigma^2_{[k]}$ belong to different 
groups. 


The second and more elaborate method gives an ordering of the variances 
into overlapping classes. Based on the distribution of the maximum F-ratio 
it follows naturally from the procedure for means recently proposed by Dun- 
can [3]. 

The technique for carrying out the two tests is illustrated by an example in 
the following section. 

Another question briefly considered is the use of the $F_{\max}$-ratio in the setting 
up of confidence intervals for the ratios of the $\sigma_t^2$. 

Although not referred to in the sequel the work of Bechhofer and Sobel [1] 
should be mentioned here. These authors deal with the problem in the design 
of an experiment of how large sample sizes must be to give a pre-assigned 
probability of a correct ranking under specified conditions on certain of the 
population variance ratios. 





* I am indebted to B. C. Halliburton of C.S.I.R.O., Prospect, N.S.W., for carrying out the heavy 
computations underlying Table II, and to Betty Laby of the University of Melbourne for assistance with Table III. My 
thanks are also due to a referee for a number of valuable comments. 




2. AN EXAMPLE 


It is required to compare five machines for their suitability in producing a 
certain dimension on a new product as uniformly as possible. Mean square 
estimates of variance, based on ten articles from each machine, were found (on 
re-arranging) to be as follows: 


Machine    A      B      C      D      E 
$s^2$      2.8    9.4    11.7   22.5   28.2 


The null-hypothesis of equal variances can be rejected at the 5% level of 
significance since $s^2_{(5)}/s^2_{(1)} = 28.2/2.8$ exceeds the tabulated 5% point, 7.11 
($k = 5$, $\nu = 9$), of the maximum F-ratio (see [2] or [5]). 

Turning to Table II we see that of the four ratios $s^2_{(t'+1)}/s^2_{(t')}$ only $s^2_{(2)}/s^2_{(1)}$ 
is significantly large at the 5% level. Machine A may thus be regarded as giving 
superior reproducibility, no other clear separation between the machines being 
indicated by the test. 
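
The arithmetic of the over-all test and of the gap-test can be sketched as follows. The mean squares and the 5% point 7.11 are taken from the text; the gap-test critical value passed to `gap_groups` is a placeholder (3.0), not the Table II entry for $\nu = 9$, and any value between the second-largest and the largest successive ratio leads to the same grouping. The function name is illustrative.

```python
# Mean squares in ascending order, as listed in the underlined summary.
mean_squares = [2.8, 9.4, 11.7, 22.5, 28.2]
F_MAX_5PCT = 7.11        # 5% point of the maximum F-ratio, k = 5, nu = 9 (from the text)

# Over-all test: largest mean square over smallest.
f_max = max(mean_squares) / min(mean_squares)
overall_significant = f_max > F_MAX_5PCT

# Successive ratios s^2_(t'+1) / s^2_(t') used by the gap-test.
ratios = [b / a for a, b in zip(mean_squares, mean_squares[1:])]


def gap_groups(values, critical):
    """Split ascending values into groups wherever the ratio of successive
    values exceeds `critical`."""
    groups, current = [], [values[0]]
    for prev, nxt in zip(values, values[1:]):
        if nxt / prev > critical:
            groups.append(current)
            current = []
        current.append(nxt)
    groups.append(current)
    return groups


groups = gap_groups(mean_squares, 3.0)   # 3.0 is a placeholder critical value
```

Only the first successive ratio (about 3.36) stands out; the other three lie between about 1.2 and 1.9, which is why machine A alone is separated.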

If a more extensive ranking is wanted the multiple $F_{\max}$-ratio test may be 
used in the following manner (cf. Duncan [3]). Let $C(k, \nu)$ denote a typical 
entry in Table III. Form the ratio $s^2_{(5)}/s^2_{(1)}$. Since this is the maximum ratio 
obtainable from the five mean squares, compare it with $C(5, 9)$, which it 
exceeds. Hence declare that $\sigma^2_A < \sigma^2_E$. Next, test $s^2_{(5)}/s^2_{(2)}$ against $C(4, 9)$. As 
this does not turn out to be significant do not form $s^2_{(5)}/s^2_{(3)}$ and $s^2_{(5)}/s^2_{(4)}$. 
Repeat the whole process, working from right to left and starting with $s^2_{(4)}$. 
This is found to differ significantly from $s^2_{(1)}$, by comparison with $C(4, 9)$, but 
not from $s^2_{(2)}$. Finally, $s^2_{(3)}$ does not differ significantly from $s^2_{(1)}$, which 
completes the procedure. The results may be summed up by suitable underlining 
thus: 


    2.8    9.4    11.7    22.5    28.2 
    __________________ 
           ___________________________ 

where any two mean squares are significantly different only if they are not 
underscored by the same line. 


3. TUKEY'S GAP-THEORY 


Let $x_1, x_2, \ldots, x_k$ be $k$ variates with common cumulative distribution $F(x)$. 
There will be $k-1$ “gaps” of lengths $x_{(t'+1)} - x_{(t')}$ ($t' = 1, 2, \ldots, k-1$) between 
the variates arranged in ascending order of magnitude. Tukey [6] has shown 
that the expected number $p_G$ of gaps per sample, which exceed a specified 
length $G$ is 

$$p_G = k \int_{-\infty}^{\infty} \left\{ [F(x) + 1 - F(x+G)]^{k-1} - [F(x)]^{k-1} \right\} dF(x) \qquad (1)$$

and that for a symmetrical parent population this takes the form 

$$p_G = 2k \int_{-\infty}^{-\frac{1}{2}G} [F(x) + 1 - F(x+G)]^{k-1}\, dF(x) - \left[ 2F(-\tfrac{1}{2}G) \right]^{k} \qquad (2)$$

Thus given a set of $k$ observations with common known variance, say unity, 
but possibly different means we may declare all gaps greater than $G$ to be 
significantly large. On the null-hypothesis of equal means this will lead to $p_G$ 
incorrect statements per sample. 

Table I gives for $k = 2$ to 12 the values of $G$ making $p_G$ equal to 0.05 and 0.01, 
when the parent distribution is unit normal. These critical levels, $G_\alpha$, were 
obtained by quadrature applied to equation (2) for selected values of $G$, and 
subsequent inverse interpolation. The $G_\alpha$ are, of course, not percentage points but 
they will in fact be good first approximations to the $100\alpha\%$ points of the largest 
gap, being identical with the latter for $k = 2$ and underestimates for $k > 2$. 


TABLE I 


CRITICAL $100\alpha\%$ LEVELS OF THE $k-1$ DIFFERENCES, $x_{(t'+1)} - x_{(t')}$, 
IN A SET OF $k$ RANKED UNIT NORMAL VARIATES 


$k$                 2     3     4     5     6     7     8     9     10    11    12 
$\alpha = 0.05$     2.77  2.48  2.28  2.13  2.02  1.93  1.86  1.81  1.76  1.72  1.69 
$\alpha = 0.01$     3.64  3.19  2.90  2.71  2.57  2.47  2.39  2.33  2.27  2.23  2.19 


4. APPLICATION TO THE ORDERING OF VARIANCES 


In applying the foregoing theory to the case of k mean squares calculated
from normal samples and based on ν degrees of freedom we may take

$$x_i = \log s_i^2.$$

An equivalent, but slightly more convenient, procedure is to consider the
magnitudes of the k-1 ratios $s^2_{(i+1)}/s^2_{(i)}$ and to effect a sub-division whenever
any of these exceeds a critical value $R_\alpha(\nu)$. This quantity is tabulated in Table
II for the above values of k and a sufficient coverage of ν-values (except for the
computationally difficult case ν=3).

From (1) it may be seen that $R_\alpha$ is the solution of the equation

$$\alpha = k \int_0^{\infty} \left\{ [F(x) + 1 - F(R_\alpha x)]^{k-1} - [F(x)]^{k-1} \right\} dF(x) \tag{3}$$

where F(x) is the cumulative distribution of $\chi^2$ with ν degrees of freedom. The
methods used to solve for $R_\alpha$ follow the lines employed by Hartley [4] and
David [2] in finding the upper percentage points of the maximum F-ratio. For
ν=2 an "exact" solution is possible by iteration, since we have in this case


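$$\begin{aligned}
\alpha &= k \int_0^1 (1 - y + y^{R})^{k-1}\, dy - 1 \\
&= k \sum_{t=1}^{k-1} \binom{k-1}{t} B(k-t,\; tR+1) \\
&= k! \sum_{t=1}^{k-1} \left\{ t!\,(1 + tR)(2 + tR) \cdots (k-t + tR) \right\}^{-1},
\end{aligned}$$

so that the (i+1)th approximation $R_{i+1}(\alpha)$ can be expressed in terms of $R_i(\alpha)$ by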
[The iteration formula is illegible in the source.]









TABLE II

CRITICAL 100α% LEVELS OF THE k-1 RATIOS, $s^2_{(i+1)}/s^2_{(i)}$, IN A SET
OF k RANKED MEAN SQUARES, EACH BASED ON ν DEGREES
OF FREEDOM. (NORMAL VARIATION ASSUMED)

(a) 5% levels

    ν \ k     2     3     4     5     6     7     8     9    10    11    12
      2    39.0  32.9  28.8  26.7  25.4  24.6  24.1  23.7  23.4  23.1  22.9
      4    9.60  8.08  7.24  6.69  6.33  6.08  5.92  5.80  5.70  5.63  5.58
      5    7.15  6.09  5.48  5.07  4.81  4.62  4.50  4.40  4.33  4.27  4.22
      6    5.82  5.01  4.54  4.21  3.99  3.84  3.74  3.66  3.60  3.54  3.49
      7    4.99  4.34  3.95  3.67  3.49  3.37  3.29  3.21  3.15  3.10  3.06
      8    4.43  3.88  3.55  3.31  3.15  3.04  2.96  2.90  2.85  2.81  2.77
      9    4.03  3.55  3.26  3.05  2.90  2.80  2.74  2.68  2.64  2.60  2.56
     10    3.72  3.30  3.03  2.85  2.72  2.63  2.57  2.51  2.47  2.44  2.41
     12    3.28  2.94  2.72  2.56  2.45  2.38  2.32  2.28  2.24  2.21  2.19
     15    2.86  2.59  2.42  2.29  2.21  2.14  2.09  2.06  2.03  2.01  1.98
     20    2.46  2.26  2.12  2.03  1.96  1.91  1.87  1.85  1.83  1.81  1.79
     30    2.07  1.93  1.84  1.77  1.72  1.68  1.65  1.63  1.62  1.60  1.59
     60    1.67  1.59  1.53  1.49  1.46  1.44  1.42  1.40  1.39  1.39  1.38
      ∞    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00

(b) 1% levels

    ν \ k     2     3     4     5     6     7     8     9    10    11    12
      2     199   153   135   126   121   118   115   114   112   111   110
      4    23.2  18.1  15.9  14.8  14.0  13.6  13.2  13.0  12.8  12.6  12.5
      5    14.9  11.9  10.4  9.68  9.21  8.87  8.64  8.47  8.33  8.21  8.11
      6    11.1  8.86  7.83  7.27  6.93  6.69  6.52  6.38  6.26  6.15  6.06
      7    8.89  7.18  6.35  5.90  5.62  5.44  5.32  5.21  5.12  5.03  4.96
      8    7.50  6.11  5.42  5.05  4.82  4.66  4.55  4.47  4.40  4.34  4.28
      9    6.54  5.40  4.82  4.49  4.29  4.15  4.06  3.99  3.93  3.88  3.83
     10    5.85  4.85  4.34  4.06  3.88  3.75  3.67  3.60  3.55  3.50  3.46
     12    4.91  4.14  3.73  3.49  3.34  3.24  3.17  3.11  3.07  3.03  3.00
     15    4.07  3.49  3.17  2.99  2.87  2.78  2.72  2.68  2.64  2.61  2.59
     20    3.32  2.91  2.67  2.53  2.43  2.36  2.32  2.28  2.26  2.24  2.22
     30    2.63  2.36  2.20  2.10  2.03  1.98  1.94  1.92  1.90  1.89  1.88
     60    1.96  1.82  1.72  1.67  1.63  1.60  1.58  1.56  1.55  1.54  1.54
      ∞    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00


For k=2, $R_\alpha(\nu)$ is simply the upper 50α% point of the F-distribution with
ν, ν degrees of freedom.

The remainder of Table II was built up from a framework obtained by
quadrature for the cases ν=4, 6 and 12, with k even. It was possible to fill in
the remaining entries by interpolation, using as an auxiliary function the
approximation

$$R_\alpha(\nu) = \exp\left\{ G_\alpha \sqrt{2/(\nu - 1)} \right\}.$$
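As a rough illustration of how close the auxiliary approximation comes to a tabulated value, the sketch below evaluates exp{G_α √[2/(ν - 1)]} for k = 2 and ν = 10 at the 5% level. The function name is hypothetical. The inputs are G_α = 2.77 from Table I and, for comparison, the entry 3.72 from Table II.

```python
from math import exp, sqrt


def ratio_critical_approx(g_alpha, nu):
    """Auxiliary approximation exp{G_alpha * sqrt(2 / (nu - 1))}."""
    return exp(g_alpha * sqrt(2.0 / (nu - 1)))


# k = 2, nu = 10, 5% level: Table I gives G = 2.77, Table II gives R = 3.72.
approx = ratio_critical_approx(2.77, 10)
print(round(approx, 2))
```

The approximation falls a little below the tabulated 3.72, which fits its stated role as an aid to interpolation rather than a substitute for the table.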
5. CONFIDENCE STATEMENTS BASED ON THE 
MAXIMUM F-RATIO DISTRIBUTION 


Percentage points, $F_{\max}(\alpha)$, of the maximum F-ratio have been constructed
so that, on the null-hypothesis of equal population variances, we have

$$\Pr\left( s^2_{(k)}/s^2_{(1)} < F_{\max}(\alpha) \right) = 1 - \alpha \tag{4}$$

or, equivalently, if $s_i^2$, $s_j^2$ are randomly chosen,

$$\Pr\left( 1/F_{\max}(\alpha) < s_i^2/s_j^2 < F_{\max}(\alpha) \right) = 1 - \alpha \quad (\text{all } i, j). \tag{4'}$$
















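It follows that if, in fact, the null-hypothesis is not true and $\sigma_i^2$, $\sigma_j^2$ are the
variances corresponding to $s_i^2$, $s_j^2$, then the joint confidence statement

$$\frac{s_i^2}{s_j^2\, F_{\max}(\alpha)} < \frac{\sigma_i^2}{\sigma_j^2} < \frac{F_{\max}(\alpha)\, s_i^2}{s_j^2} \quad (\text{all } i, j) \tag{5}$$

holds with probability 1-α.

If a confidence statement about $\sigma^2_{\max}/\sigma^2_{\min}$ is wanted, a little care is necessary
since it cannot be assumed that $\sigma^2_{\max}$ corresponds to $s^2_{(k)}$. Consequently it
follows from (5) that

$$\Pr\left( \frac{s^2_{(k)}}{s^2_{(1)}\, F_{\max}(\alpha)} < \frac{\sigma^2_{\max}}{\sigma^2_{\min}} < \frac{F_{\max}(\alpha)\, s^2_{(k)}}{s^2_{(1)}} \right) \tag{6}$$

is at least equal to 1-α.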


Example. For the case considered in Section 2 we can by means of (5) make a
joint 95% confidence statement about the ten variance ratios involved, a typical
statement being

$$\frac{1}{7.11} \times \frac{22.5}{9.4} < \frac{\sigma_4^2}{\sigma_2^2} < 7.11 \times \frac{22.5}{9.4}.$$

Again, for $\sigma^2_{\max}/\sigma^2_{\min}$, (6) gives the interval

$$\left( \frac{1}{7.11} \times \frac{28.2}{2.8},\;\; 7.11 \times \frac{28.2}{2.8} \right) = (1.42,\; 71.6)$$

which has a confidence coefficient of at least 0.95.
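The arithmetic of the Example can be reproduced with a few lines of code. This is only a sketch. The helper `ratio_interval` is a hypothetical name, and the value 7.11 is the percentage point quoted in the Example.

```python
from itertools import combinations


def ratio_interval(s_num, s_den, f_max):
    """Interval (5) for sigma_num^2 / sigma_den^2, given the point F_max."""
    r = s_num / s_den
    return (r / f_max, r * f_max)


mean_squares = [2.8, 9.4, 11.7, 22.5, 28.2]
F_MAX = 7.11  # upper 5% point of the maximum F-ratio used in the Example

# One interval per pair of mean squares: ten joint statements for k = 5.
intervals = {
    (lo, hi): ratio_interval(hi, lo, F_MAX)
    for lo, hi in combinations(sorted(mean_squares), 2)
}
# Statement (6) uses the largest and smallest sample mean squares.
extreme = ratio_interval(max(mean_squares), min(mean_squares), F_MAX)
print(len(intervals), tuple(round(v, 2) for v in extreme))
```

The dictionary holds the ten joint statements, and `extreme` gives the interval for the ratio of the largest to the smallest variance.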


6. A MULTIPLE TEST FOR THE RANKING OF VARIANCES, 
BASED ON THE MAXIMUM F-RATIO DISTRIBUTION 


In a recent paper Duncan [3] has given a new multiple range test for the 
ranking of means in an analysis of variance. His approach is easily applied to 
the ranking of variances. Indeed, the latter is in principle an easier problem 
since no studentization is necessary for the comparison of sample variances, 
as it is in the case of means. The basic distribution used by Duncan in the 
construction of the necessary tables is that of the studentized range,
corresponding to which our Table III is based on the distribution of the maximum
F-ratio. Thus the typical entry $C_\alpha(\nu, k)$ is the upper $100[1-(1-\alpha)^{k-1}]$%
point of $s^2_{\max}/s^2_{\min}$ in the case of k mean squares, each based on ν D.F. For
α=0.01 it was possible to obtain the table from a slight extension of a set of
selected values of the probability integral of $s^2_{\max}/s^2_{\min}$, previously prepared
by the writer in connection with tables of upper 5 and 1% points of this ratio.

The use of Table III has been exemplified in Section 2. It will be noted from
the definition of $C_\alpha(\nu, k)$ that there is an ordinary 1% significance level when
only two mean squares are compared, and that this level is progressively relaxed
as more mean squares are taken into consideration. This is a basic feature of
Duncan's test and is discussed in detail in his paper. We confine ourselves to
pointing out that in the present context Duncan's arguments apply even more
forcibly in view of the absence of studentization and the consequent
independence of the quantities, $s_i^2$, to be ranked.
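The progressive relaxation of the significance level can be made concrete with a small sketch that evaluates 1 - (1 - α)^(k-1) for α = 0.01 and k = 2 to 12. The function name is illustrative only.

```python
def significance_level(alpha, k):
    """Level 1 - (1 - alpha)^(k - 1) applied when k mean squares are spanned."""
    return 1.0 - (1.0 - alpha) ** (k - 1)


levels = {k: significance_level(0.01, k) for k in range(2, 13)}
for k, level in levels.items():
    print(k, round(100 * level, 2))
```

The level starts at the ordinary 1% for two mean squares and rises to a little over 10% when twelve are spanned.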










TABLE III

SIGNIFICANT MAXIMUM F-RATIOS IN A 1% LEVEL MULTIPLE
$F_{\max}$-RATIO TEST. (NORMAL VARIATION ASSUMED)

    ν \ k     2     3     4     5     6     7     8     9    10    11    12
      2     199   224   243   258   272   283   294   303   311   319   326
      3    47.5  52.9  57.0  60.2  62.9  65.4  67.5  69.1  70.7  72.2  73.7
      4    23.2  25.5  27.3  28.7  29.8  30.9  31.8  32.5  33.2  33.9  34.5
      5    14.9  16.4  17.4  18.2  18.9  19.4  19.9  20.4  20.8  21.1  21.5
      6    11.1  12.1  12.8  13.3  13.8  14.1  14.5  14.7  15.0  15.3  15.5
      7    8.89  9.64  10.2  10.6  10.9  11.2  11.4  11.6  11.8  12.0  12.2
      8    7.50  8.10  8.52  8.84  9.08  9.29  9.49  9.65  9.81  9.94  10.1
      9    6.54  7.04  7.38  7.64  7.84  8.03  8.18  8.31  8.44  8.55  8.65
     10    5.85  6.27  6.55  6.78  6.95  7.11  7.24  7.35  7.45  7.55  7.64
     12    4.91  5.22  5.45  5.62  5.75  5.87  5.97  6.06  6.13  6.21  6.27
     15    4.07  4.31  4.47  4.60  4.69  4.78  4.86  4.92  4.98  5.03  5.10
     20    3.32  3.48  3.60  3.68  3.75  3.82  3.87  3.91  3.95  3.99  4.02
     30    2.63  2.73  2.81  2.86  2.91  2.95  2.98  3.01  3.03  3.06  3.08
     60    1.96  2.02  2.05  2.09  2.11  2.13  2.14  2.16  2.17  2.18  2.19
      ∞    1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00


7. DISCUSSION 


The respective advantages of the two methods presented have already been 
mentioned. The first is simpler and more suitable when only a clear separation 
of the variances into distinct groups is wanted. Of course, the second procedure 
may also lead to such a clear separation but is, in general, less likely to do so, 
as may be seen from a comparison of Tables IIb and III. For k=2 the two
methods are identical, but for k>2 one of the conditions that a clear separation
of $\sigma^2_{(i)}$ and $\sigma^2_{(i+1)}$ should be effected by the second method is that the ratio
$s^2_{(i+1)}/s^2_{(i)}$ must exceed C(2, ν). This condition becomes more stringent with
increasing k as the spacing of the sample variances will become denser on the
null-hypothesis. Allowance for this effect is made in the first method.

Finally, it should be remembered that even when the population variances
are all different, $s^2_{(i)}$ does not necessarily correspond to the ith largest of them.
Thus any decision as to ranking involves a risk of error apart from that of in- 
correct rejection of the null hypothesis of equal variances. However, control of 
the latter risk will ensure also control of gross errors in ranking, i.e. declaring 
one mean square significantly larger than another when in fact the correspond- 
ing population variances are in the reverse order. 


REFERENCES 


[1] Bechhofer, R. E. and Sobel, M., “A single sample multiple decision procedure for rank- 
ing variances of normal populations,” Annals of Mathematical Statistics, 25 (1954), 
273-89. 


[2] David, H. A., “Upper 5 and 1% points of the maximum F-ratio,” Biometrika, 39 (1952), 
422-24. 

[3] Duncan, D. B., “Multiple range and multiple F tests,” Biometrics, 11 (1955), 1-42.

[4] Hartley, H. O., “The maximum F-ratio as a short cut test for heterogeneity of
variance,” Biometrika, 37 (1950), 308-12.

[5] Pearson, E. S., and Hartley, H. O., Biometrika Tables for Statisticians, Vol. 1,
Cambridge University Press (1954), Table 31.

[6] Tukey, J. W., “Comparing individual means in the analysis of variance,” Biometrics,
5 (1949), 99-114.





THE CONDITION FOR LOT SIZE PRODUCTION* 
Myron J. Gordon
Massachusetts Institute of Technology
AND
William J. Taylor
Ohio State University 
I. STATEMENT OF THE PROBLEM 


The Wilson formula [6] for determining the optimum lot size, the quantity
in which an item of inventory should be purchased or produced for stock, is

$$q = \sqrt{\frac{2yh}{s}} \tag{1}$$

where y = annual sales, h = set-up cost, and s = the cost of carrying a unit (the
required return on investment, space, handling, taxes, etc.) for one year. This
formula follows from the propositions, (a) the optimum quantity minimizes
the sum of the annual set-up and carrying costs, and (b) these costs are

$$\frac{y}{q}\,h + \frac{q}{2}\,s. \tag{2}$$


Although Eq. (2) is not an exact statement of inventory costs even when sales 
are of unit size and take place at a constant rate, it has been shown [5] that 
under these conditions Eq. (1) is correct. 
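A brief sketch shows how the lot size of Eq. (1) follows from the cost expression of Eq. (2), reading Eq. (1) as q = √(2yh/s) and Eq. (2) as (y/q)h + (q/2)s. The function names and the numerical figures are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt


def annual_cost(q, y, h, s):
    """Annual set-up plus carrying cost, (y/q)h + (q/2)s, for lot size q."""
    return (y / q) * h + (q / 2.0) * s


def wilson_lot_size(y, h, s):
    """Lot size sqrt(2yh/s), which minimizes annual_cost over q."""
    return sqrt(2.0 * y * h / s)


# Hypothetical figures: 3000 units a year, $25 set-up, $1.30 to carry a unit.
y, h, s = 3000.0, 25.0, 1.30
q_star = wilson_lot_size(y, h, s)
print(round(q_star, 1), round(annual_cost(q_star, y, h, s), 2))
```

At the optimum the set-up and carrying components are equal, so the minimum annual cost is √(2yhs).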

However, for wholesalers and manufacturers an item for which sales are of 
unit size and at a constant rate is the exception. When sales vary in size, take 
place irregularly, and the time and size of sales are uncertain, Eq. (2) is not a 
correct statement of the annual inventory costs, and the validity of Eq. (1) is 
suspect. We know that when sales are very infrequent and/or vary over a wide 
range in size, a firm will buy or produce the item only to fill orders rather than 
carry a stock. The purpose of the present paper is to construct a model which 
tells whether it is more profitable to produce an item for stock or to produce it 
only as sales are made, when the frequency and size of sales are uncertain. 

The solution to the problem presented here does not establish the optimum 
lot size in the case where production for stock is indicated. The reason for 
establishing only the condition for lot size production therefore deserves ex- 
planation. In another manuscript the writers present a formula for the optimum 
lot size when the time and size of sales are uncertain. As expected, under certain 
values for the parameters of the model the indicated course of action is produc- 
tion to order. However, the model and mathematical analysis necessary to 
establish the optimum lot size involve an infinite sequence of lots, and are more 





* The research for this paper was supported in part by the Sloan Research Fund of the School of Industrial
Management, Massachusetts Institute of Technology. William J. Taylor is largely responsible for the mathematical 
methods used in the paper. The authors benefited from the advice of G. Dannerstedt, T. M. Whitin, and the 
referee on earlier drafts of the paper. 




elaborate than necessary to establish the condition for lot production alone. 
The latter can be obtained by consideration of a simpler single lot model, and 
it seems worthwhile to present this analysis both for the value of the result ob- 
tained and as an introduction to some of the methods used in the more detailed 
analysis to follow. It may also be noted that the numerical evaluation of the 
condition for lot size production proves to be a simple computation, so that it 
is likely to be of practical value to industry. 


II. DESCRIPTION OF THE MODEL
a. The Gain or Loss on Lot Size Production 


At time t=0 an order is received, and some quantity $n_0$ of the item must be
produced to fill the order. The firm can simply produce the $n_0$ units, or it can
produce $n_0$ plus a lot or over run of n units in excess of $n_0$. In the latter event
the n units will be shipped out on subsequent sales as they are made. Let the
times of these subsequent sales be denoted by

$$t_1, t_2, \ldots, t_i, \ldots, t_m,$$

and let the size of these sales be denoted by

$$n_1, n_2, \ldots, n_i, \ldots, n_m.$$

The sale at $t_m$ is taken as the residual stock at that time rather than the actual
order, so that the following equality holds:

$$\sum_{i=1}^{m} n_i = n_1 + n_2 + \cdots + n_i + \cdots + n_m = n. \tag{3}$$

If the firm produces an over run of n units at $t_0$ it will incur the cost of
carrying the n units until they are sold. The carrying cost is composed of the
out-of-pocket storage cost (space, handling, taxes, insurance, etc.) and of the foregone
return on the funds tied up. Let β′dt be the cost of storing a unit of the item
for the period dt, c be the cost to manufacture a unit of the item, and k be the
required rate of return on investment. Then, the cost of carrying a unit of the
item for the period dt is

$$\beta\, dt = (ck + \beta')\, dt. \tag{4}$$

Discounted to the time of production, $t_0$, the total accumulated cost of carrying
a unit of the item until the time of its sale, $t_i$, is

$$\int_0^{t_i} e^{-kt} \beta\, dt = \frac{\beta}{k}\left[ 1 - e^{-kt_i} \right]. \tag{5}$$

By discounting, costs incurred at different points in time are converted to their
equivalent value at one point in time and thereby made comparable (see [1]).
Therefore, the cost of carrying n units produced at $t_0$, sold in the sales
$n_1, n_2, \ldots, n_i, \ldots, n_m$, and valued as of $t_0$ is

$$\frac{\beta}{k} \sum_{i=1}^{m} n_i \left[ 1 - e^{-kt_i} \right]. \tag{6}$$










The advantage in producing the n units is the future set-up costs thereby
avoided. If the n units are exactly exhausted by m sales, a set-up cost at the
time of each of the m sales is avoided by an over run. If the mth sale exceeds
the residual stock at $t_m$, a set-up will be required at $t_m$, and the firm only avoids
a set-up at each of the times $t_1, t_2, \ldots, t_{m-1}$. Let the cost of a set-up be h and
let the number of set-ups avoided by the production of n units at $t_0$ be m′
(= m or m-1). Then, the present value of the costs avoided by producing an
over run of n units is

$$h \sum_{i=1}^{m'} e^{-kt_i}. \tag{7}$$


Notice that the set-up at $t_0$ and the cost to manufacture the $n_0+n$ units under
consideration are ignored, since these costs will be incurred under either
alternative. The foregone return on the funds tied up by producing an over run
is included in the carrying cost.

The expression (7) less the expression (6) is the gain or loss on producing an
over run of n units at $t_0$ by comparison with producing the n units as orders
are received for them. Combining the two expressions we have

$$\begin{aligned}
G &= h \sum_{i=1}^{m'} e^{-kt_i} - \frac{\beta}{k} \sum_{i=1}^{m} n_i \left[ 1 - e^{-kt_i} \right] \\
&= h \sum_{i=1}^{m'} e^{-kt_i} - \frac{\beta}{k}\, n + \frac{\beta}{k} \sum_{i=1}^{m} n_i e^{-kt_i}.
\end{aligned} \tag{8}$$
inl k k ina 


This expression is to be averaged over the times and sizes of sale in the manner 
described in Section III. If there is no value of n for which the average, or EG, 
is positive, a policy of producing only as orders are received is indicated. Con- 
versely, if there is some value of n for which EG is positive, the production of 
n units, where n is any such value, is more profitable or less costly than
producing as sales are made.
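For a single realized sequence of sales the gain of Eq. (8) is a direct computation: the discounted set-up costs avoided less the discounted carrying cost. The sketch below assumes that reading of Eq. (8). The function name and the sample sales are hypothetical.

```python
from math import exp


def gain(h, beta, k, times, sizes, setups_avoided):
    """Gain G for one realized sequence of sales drawn from the over run.

    times[i] and sizes[i] are the time and size of the (i+1)th sale;
    setups_avoided is m' and equals len(times) or len(times) - 1.
    """
    avoided = h * sum(exp(-k * t) for t in times[:setups_avoided])
    carrying = (beta / k) * sum(
        n * (1.0 - exp(-k * t)) for n, t in zip(sizes, times)
    )
    return avoided - carrying


# Hypothetical realization: an over run of 10 units sold in three sales.
times, sizes = [0.1, 0.25, 0.4], [4, 3, 3]
g = gain(h=25.0, beta=1.30, k=0.20, times=times, sizes=sizes, setups_avoided=3)
print(round(g, 2))
```

Averaging this quantity over the random times, sizes and number of sales is what the next section carries out analytically.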

b. Frequency of Sales 


We assume that sales are randomly spaced but with an average frequency
of ν sales per unit time. The intervals between successive sales will be denoted
by

$$\tau_i = t_i - t_{i-1}; \quad i = 1, 2, \ldots, m; \quad t_0 = 0. \tag{9}$$

The distribution law for the $\tau_i$'s will be required, and it can be shown [4] that

$$W(\tau_i)\, d\tau_i = e^{-\nu \tau_i}\, \nu\, d\tau_i \tag{10}$$

is the probability that the interval $\tau_i$ will be between $\tau_i$ and $\tau_i + d\tau_i$.
c. The Size and Number of Sales 


It will be assumed that there is no correlation between the time, $t_i$, that a sale
is made and its size, $n_i$.










The sizes of the sales, $n_i$, will, in general, vary within a given sequence,
$n_1, \ldots, n_m$, and also from one such run to another. Since the $n_i$'s vary from
one run to another it may be expected that the number, m, of sales necessary
to exhaust the stock will also vary. A complete treatment of the problem
requires not only a knowledge of the distribution function for the $n_i$'s (sizes of
sales), but in addition the distribution function for m (number of sales). In
principle, the distribution function for m can be computed if that for the $n_i$'s is
given (the reverse procedure is not possible, in general), but there are practical
computational difficulties connected with this approach. We will, therefore,
complete our model by specifying on an intuitive basis a random process for
determining the $n_i$'s in any given run which leads in a simple way to both of
the distribution functions mentioned above. The resulting distribution
function for the $n_i$'s is of a form which seems plausible in many applications, and
contains one parameter which allows the size of the average sale to be varied
at will.

Consider a sequence of Bernoulli trials with a the probability of success and
(1-a) the probability of failure. We assume that $n_1$, the size of the first order,
is given by the number of trials up to and including the first success, and, in
general, following the (i-1)th success or order, the number of trials up to and
including the ith success is $n_i$. The distribution function for $n_i$ is, therefore,
the geometric distribution

$$W(n_i) = a(1-a)^{n_i - 1}, \quad n_i \ge 1. \tag{11}$$


This distribution is applicable in those cases where the frequency of an order 
size falls as the order size is increased, and the distribution is a satisfactory 
approximation for the purposes of the problem where the empirical distribution 
has little central tendency. 

It is easily shown that 1/a is the average size of an order, and the average size 
of the orders expected to be received is probably the best basis for the estimation of a. 

It is important to realize that Eq. (11) gives the distribution function for $n_i$,
and 1/a is the average size of a sale, only when no special restrictions are placed
on the allowed values of $n_i$. If we consider the set of all possible runs for which
both n and m have definite assigned values, then from Eq. (3) the largest of
the numbers $n_1, n_2, \ldots, n_m$ cannot exceed n-(m-1), and the mean value
of $n_i$ for all such runs is

$$E(n_i; n, m) = \frac{1}{m} \sum_{j=1}^{m} n_j = \frac{n}{m}; \quad i = 1, 2, \ldots, m. \tag{12}$$

In the subsequent analysis the detailed form of the distribution of the $n_i$'s for
all runs with given n and m will not be required. It will be sufficient to know
the mean values as given by Eq. (12).

Turning now to the number of orders, m, required to exhaust a lot of n units,
we first note that the probability of μ successes in (n-1) trials is given by the
binomial distribution

$$P(\mu; n) = \binom{n-1}{\mu} a^{\mu} (1-a)^{n-1-\mu}, \quad 0 \le \mu \le n-1. \tag{13}$$

In terms of our model μ successes in (n-1) trials means that (n-1) units or
less are sold in μ sales, and the (μ+1)th order is equal to or greater than the
remainder of the n units. Therefore, m=μ+1, and the probability that m orders
will exhaust a lot of n units is simply the probability of μ successes in (n-1)
trials, or

$$P(m; n) = \binom{n-1}{m-1} a^{m-1} (1-a)^{n-m}, \quad 1 \le m \le n. \tag{14}$$


This result can also be obtained by more advanced methods of the theory of 
random variables, or random flights. These would involve finding first by 
standard methods the probability distribution for the total units in (m—1) 
sales, followed by a subsequent summation to obtain P(m; n). However, in 
view of the simplicity of the preceding argument for our special model, the more 
general discussion is not required. 
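The binomial form for P(m; n) can be checked against the longer route just mentioned: find the distribution of the total units in the first m-1 geometric sales, then sum over the cases in which the mth order covers the remainder. The sketch below does this by direct enumeration. The function names and the test values n = 6, a = 0.3 are illustrative assumptions.

```python
from math import comb


def p_orders_binomial(m, n, a):
    """P(m; n) in the binomial form C(n-1, m-1) a^(m-1) (1-a)^(n-m)."""
    return comb(n - 1, m - 1) * a ** (m - 1) * (1 - a) ** (n - m)


def p_orders_direct(m, n, a):
    """P(m; n) built up from geometric order sizes a(1-a)^(size-1)."""
    # dist[j] = Pr(the first m-1 orders total exactly j units), j <= n-1.
    dist = [0.0] * n
    dist[0] = 1.0
    for _ in range(m - 1):
        new = [0.0] * n
        for j, p in enumerate(dist):
            for size in range(1, n - j):
                new[j + size] += p * a * (1 - a) ** (size - 1)
        dist = new
    # The mth order must be at least the n - j units still in stock.
    return sum(p * (1 - a) ** (n - j - 1) for j, p in enumerate(dist))


n, a = 6, 0.3
table = [(p_orders_binomial(m, n, a), p_orders_direct(m, n, a))
         for m in range(1, n + 1)]
print([round(b, 4) for b, _ in table])
```

The two columns agree term by term, and the probabilities sum to one over m = 1, ..., n.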


III. SOLUTION OF THE PROBLEM 


In this section the solution to the problem stated in Section I is obtained for
the model described in Section II. We first find the average gain or loss on the
production of an over run of n units at $t_0$ by comparison with the alternative
of producing the units as orders are received. We then establish the condition
under which there is a gain, i.e., the condition for lot size production. Finally,
a simplified form of the condition is found for important special cases.


a. The Average Gain or Loss 


Consider a particular over run of n units that will be sold in m sales at times
$t_1, t_2, \ldots, t_i, \ldots, t_m$. The discounted value of the set-up costs avoided is

$$h \sum_{i=1}^{m'} e^{-kt_i}. \tag{15}$$

Now represent $t_i$ in terms of the intervals between sales, the $\tau_i$'s of Eq. (9); we
have

$$t_i = \sum_{j=1}^{i} \tau_j, \tag{16}$$

$$e^{-kt_i} = \prod_{j=1}^{i} e^{-k\tau_j}. \tag{17}$$

The $\tau_i$'s are independent random variables with the distribution function
$W(\tau_i)$ of Eq. (10) (the same for each $\tau_i$). The average of the expression (15) over
all possible times of sale, each with its proper weight, is therefore

$$h \sum_{i=1}^{m'} \prod_{j=1}^{i} \int_0^{\infty} W(\tau_j) e^{-k\tau_j}\, d\tau_j
= h \sum_{i=1}^{m'} \prod_{j=1}^{i} \int_0^{\infty} e^{-(k+\nu)\tau_j}\, \nu\, d\tau_j
= h \sum_{i=1}^{m'} \left( \frac{\nu}{k+\nu} \right)^{i}. \tag{18}$$







This calculation is similar to the method of characteristic functions, except 
that we are dealing with Laplace rather than Fourier transforms of the dis- 
tribution functions. That is, our result is equivalent to the statement that the 
Laplace transform of the distribution function for a sum of independent random 
variables is equal to the product of the Laplace transforms of the distribution 
functions for the individual variables. 

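On making use of the familiar formula

$$\sum_{i=1}^{m} x^{i} = x \left( \frac{1 - x^{m}}{1 - x} \right), \quad x^2 < 1, \tag{19}$$

which is applicable here since k>0 and 0 ≤ ν < ∞, the expression (18) reduces to

$$\frac{h\nu}{k} \left[ 1 - \left( \frac{\nu}{k+\nu} \right)^{m'} \right]. \tag{20}$$

This is the average over all times of sale of the discounted value of the set-up
costs avoided by producing n units to be sold in m′ sales.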

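We next average Eq. (20) over the possible values of m′, the number of
set-ups. As pointed out preceding Eq. (7), m′=m if the mth sale exactly
exhausts the residual stock, and m′=m-1 if the mth sale exceeds the residual
stock. The probabilities of these two events are clearly a and (1-a),
respectively. Thus, the required average of Eq. (20) is

$$a\, \frac{h\nu}{k} \left[ 1 - \left( \frac{\nu}{k+\nu} \right)^{m} \right]
+ (1-a)\, \frac{h\nu}{k} \left[ 1 - \left( \frac{\nu}{k+\nu} \right)^{m-1} \right]
= \frac{h\nu}{k} \left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right) \left( \frac{\nu}{k+\nu} \right)^{m-1} \right]. \tag{21}$$

As our next step we average the expression (21) over all values of m (number
of sales) from 1 to n, with the weighting of the distribution function P(m; n) of
Eq. (14). The result will be denoted by E(H; n), and it is the average over all
times and numbers of sales of the set-up costs avoided by producing n units.
We have

$$\begin{aligned}
E(H; n) &= \frac{h\nu}{k} \sum_{m=1}^{n} \binom{n-1}{m-1} a^{m-1} (1-a)^{n-m}
\left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right) \left( \frac{\nu}{k+\nu} \right)^{m-1} \right] \\
&= \frac{h\nu}{k} \left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right)
\sum_{m=1}^{n} \binom{n-1}{m-1} \left( \frac{a\nu}{k+\nu} \right)^{m-1} (1-a)^{n-m} \right] \\
&= \frac{h\nu}{k} \left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right)
\left( \frac{a\nu}{k+\nu} + 1 - a \right)^{n-1} \right]
\end{aligned} \tag{22}$$

or finally

$$E(H; n) = \frac{h\nu}{k} \left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right)^{n} \right]. \tag{23}$$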
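The passage from the weighted sum over m to the closed form for E(H; n) can be confirmed numerically. The sketch below assumes the reading E(H; n) = (hν/k)[1 - (1 - ak/(k+ν))^n] for the final result and uses the binomial weights P(m; n) in the sum. The function names and parameter values are hypothetical.

```python
from math import comb


def setup_saving_sum(n, h, k, nu, a):
    """E(H; n) as the sum over m of P(m; n) times the per-m average."""
    x = nu / (k + nu)              # average discount factor per sale interval
    lam = 1.0 - a * k / (k + nu)
    total = 0.0
    for m in range(1, n + 1):
        weight = comb(n - 1, m - 1) * a ** (m - 1) * (1 - a) ** (n - m)
        total += weight * (1.0 - lam * x ** (m - 1))
    return (h * nu / k) * total


def setup_saving_closed(n, h, k, nu, a):
    """Closed form (h nu / k) [1 - (1 - a k / (k + nu))^n]."""
    return (h * nu / k) * (1.0 - (1.0 - a * k / (k + nu)) ** n)


args = dict(n=12, h=25.0, k=0.20, nu=10.0, a=0.25)
print(round(setup_saving_sum(**args), 6), round(setup_saving_closed(**args), 6))
```

Agreement of the two values is a check on the binomial summation step only. It does not test the modelling assumptions behind it.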













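The average of the last term on the right-hand side of Eq. (8), the reduction
in the carrying costs because the sales are made in finite time, is obtained in a
similar manner. The reduction in the cost for a particular run of n units that
will be sold in m sales of sizes $n_1, n_2, \ldots, n_i, \ldots, n_m$, at times
$t_1, t_2, \ldots, t_i, \ldots, t_m$, is

$$\frac{\beta}{k} \sum_{i=1}^{m} n_i e^{-kt_i}. \tag{24}$$

The average of the expression over all possible times of sale, each with its
proper weight, is

$$\frac{\beta}{k} \sum_{i=1}^{m} n_i \prod_{j=1}^{i} \int_0^{\infty} W(\tau_j) e^{-k\tau_j}\, d\tau_j
= \frac{\beta}{k} \sum_{i=1}^{m} n_i \left( \frac{\nu}{k+\nu} \right)^{i}. \tag{25}$$

Next, average over all possible sizes of the sales $n_i$ with n and m fixed. On
using Eq. (12) of Section II this yields

$$\frac{\beta}{k}\, \frac{n}{m} \sum_{i=1}^{m} \left( \frac{\nu}{k+\nu} \right)^{i}. \tag{26}$$

The expression may be reduced, by means of Eq. (19), to

$$\frac{\beta \nu n}{k^2 m} \left[ 1 - \left( \frac{\nu}{k+\nu} \right)^{m} \right]. \tag{27}$$

This is the average over all times and sizes of sales of the expression (24) for n
units sold in m sales.

The average over all times, sizes, and numbers of sales of the expression (27)
is

$$\begin{aligned}
\sum_{m=1}^{n} \binom{n-1}{m-1} a^{m-1} (1-a)^{n-m}\,
\frac{\beta \nu n}{k^2 m} \left[ 1 - \left( \frac{\nu}{k+\nu} \right)^{m} \right]
&= \frac{\beta \nu}{a k^2} \sum_{m=1}^{n} \binom{n}{m} a^{m} (1-a)^{n-m}
\left[ 1 - \left( \frac{\nu}{k+\nu} \right)^{m} \right] \\
&= \frac{\beta \nu}{a k^2} \left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right)^{n} \right].
\end{aligned} \tag{28}$$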


b. The Condition for Lot Size Production 


By substituting the expressions (23) and (28) for the first and last terms on
the right-hand side of Eq. (8) we obtain the average gain on an over run of n
units at $t_0$ by comparison with the alternative of producing the n units as they
are sold. It is

$$EG = \frac{h\nu}{k} \left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right)^{n} \right]
- \frac{\beta}{k}\, n
+ \frac{\beta \nu}{a k^2} \left[ 1 - \left( 1 - \frac{ak}{k+\nu} \right)^{n} \right]. \tag{29}$$
















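Making the substitution

$$\lambda = 1 - \frac{ak}{k+\nu} \tag{30}$$

and rearranging terms

$$EG = \frac{\nu}{k} \left( h + \frac{\beta}{ak} \right) (1 - \lambda^{n}) - \frac{\beta}{k}\, n. \tag{31}$$

The condition under which it is profitable to produce a lot of n units, by
reference to the alternative of producing to order, is that EG be positive. It is
evident on inspection of Eq. (31) that EG is concave in n, with EG=0 for n=0.
Hence, the expression is somewhere positive if and only if EG′>0 for n=0. We
have

$$EG' = -\frac{\nu}{k} \left( h + \frac{\beta}{ak} \right) \lambda^{n} \ln \lambda - \frac{\beta}{k}. \tag{32}$$

The condition for lot production (n positive) is therefore

$$EG' = -\frac{\nu}{k} \left( h + \frac{\beta}{ak} \right) \ln \lambda - \frac{\beta}{k} > 0, \quad n = 0, \tag{33}$$

or that

$$\frac{h}{\beta} > -\frac{1}{\nu \ln \lambda} - \frac{1}{ak}. \tag{34}$$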
That is, for lot production to be profitable, the ratio of the set-up cost, h, to
the cost of carrying a unit item for unit time, or β, must exceed the quantity
on the right side of (34). This condition may be further simplified in special
cases, but before considering these cases it may be advisable to elaborate on
why (34) provides the condition for producing an over run while the value of n
that maximizes EG does not provide the optimum lot size.

If the condition (34) is not satisfied, there is absolutely no lot that is more 
profitable or less costly than producing the item as orders are received, since 
we have placed no restrictions on the possible values of n. If there is some in- 
terval of n over which EG is positive, the value of n that maximizes EG is the 
most profitable single lot the firm can produce. However, some fraction of this
lot size may yield a larger profit per unit time, and this is the quantity a firm 
will seek to maximize. Hence the model only establishes whether it is advisable 
to produce for order or for stock. 
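The behaviour of the average gain as a function of n can be sketched numerically. The code assumes the form EG = (ν/k)(h + β/(ak))(1 - λ^n) - (β/k)n with λ = 1 - ak/(k+ν), together with its slope at n = 0. The function names and parameter values are hypothetical and describe an item with small, frequent orders.

```python
from math import log


def expected_gain(n, h, beta, k, nu, a):
    """Average gain EG on an over run of n units."""
    lam = 1.0 - a * k / (k + nu)
    return (nu / k) * (h + beta / (a * k)) * (1.0 - lam ** n) - (beta / k) * n


def initial_slope(h, beta, k, nu, a):
    """EG' at n = 0; lot production pays only if this is positive."""
    lam = 1.0 - a * k / (k + nu)
    return -(nu / k) * (h + beta / (a * k)) * log(lam) - beta / k


# Hypothetical item: average order size 1/a = 2, ten sales per period.
params = dict(h=25.0, beta=1.30, k=0.20, nu=10.0, a=0.5)
slope = initial_slope(**params)
gains = [expected_gain(n, **params) for n in range(0, 401)]
best_n = max(range(len(gains)), key=gains.__getitem__)
print(slope > 0, best_n, round(gains[best_n], 2))
```

Here `best_n` is only the most profitable single lot in the sense described above. It is not the optimum lot size, for the reason given in the preceding paragraph.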


c. Special Cases 


When sales are very frequent relative to the required rate of return on
investment, so that k/ν is very small, then, since a is less than or equal to unity,

$$\frac{ak}{k+\nu} \ll 1 \tag{35}$$

and

$$\ln \lambda = \ln \left[ 1 - \frac{ak}{k+\nu} \right] \approx -\frac{ak}{k+\nu}.$$

In this case, the condition reduces to

$$\frac{h}{\beta} > \frac{1}{a\nu}$$

and

$$h\nu > \frac{\beta}{a}. \tag{37}$$


The condition for lot production given by (37) is that the product of the set-up 
cost and the average frequency of sale exceeds the product of the carrying cost 
and the average size of sale. 

When the average frequency of sale is large and the size of sale is equal to
one or on the average small, the condition (37) will fail to be satisfied only
when the set-up cost is an extremely small fraction of the carrying cost. In this
case, then, the condition for lot size production will ordinarily be satisfied.
However, as the average order size is increased, the required value for the
ratio h/β is correspondingly increased.

When the average size of sale is considerably greater than one, as is
frequently the case for wholesalers and manufacturers, a, its reciprocal, will be
very small, and provided k/ν is less than unity, as will usually be the case, the
Eqs. (35) to (37) also hold. In situations described by this case the condition
for lot size production will fail to be satisfied quite frequently. To illustrate the
case numerically, let

    c   = $4.00  = the cost to manufacture a unit of the item
    k   = .20    = the required rate of return on investment
    β′  = $ .50  = the cost to store a unit of the item for one period
    β   = $1.30  = ck + β′ = the cost of carrying a unit for one period
    1/a = 300    = the average size of a sale
    h   = $25.00 = the set-up cost
    ν   = 10     = the average frequency of sales per period
Since 


(25)(10) < (1.30)(300) 


it is more profitable to produce this item to order than to produce it for stock. 
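The numerical illustration can be re-run with both the approximate and the exact form of the condition. The sketch assumes that the approximate condition compares hν with β times the average sale size, and that the exact condition compares h/β with -1/(ν ln λ) - 1/(ak), where λ = 1 - ak/(k+ν). The variable names are chosen here for illustration.

```python
from math import log

c, k, storage = 4.00, 0.20, 0.50   # unit cost, required return, storage cost
beta = c * k + storage             # carrying cost per unit per period
mean_size, h, nu = 300.0, 25.00, 10.0
a = 1.0 / mean_size                # reciprocal of the average sale size

# Approximate condition: h * nu must exceed beta times the average sale size.
approx_ok = h * nu > beta * mean_size

# Exact condition: h / beta must exceed -1/(nu ln lambda) - 1/(a k).
lam = 1.0 - a * k / (k + nu)
threshold = -1.0 / (nu * log(lam)) - 1.0 / (a * k)
exact_ok = h / beta > threshold
print(approx_ok, exact_ok, round(h / beta, 2), round(threshold, 2))
```

With these figures both forms fail, in agreement with the conclusion that the item is better produced to order. The exact threshold lies very close to the approximate value 1/(aν) = 30.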

Another case of interest is that for which sales are of unit size (or the average 
size is small), and the frequency of sales is small (expensive equipment). The 
approximation (37) is then not valid, and the expression (34) should be 
evaluated. 


IV. CONCLUSION 


We have presented a model which establishes when a firm should produce to 
order rather than produce for stock. The model assumes certain distribution 
functions for the frequency and size of sales, and it ignores the safety level 










problem. The sensitiveness of the solution to the distribution functions used 
and the applicability of these functions to particular situations remain to be 
determined, but a priori considerations support the relevancy of the assump- 
tions for cases of practical interest. Disregarding the safety level problem is 
clearly of no concern when there is none, i.e., when the production time is short
or when immediate delivery is not expected in the trade. 

When ability to provide immediate delivery is desired, a firm may have a 
reorder point above zero under either of the alternatives of producing for stock 
or producing when an order is received. However, the validity of the model for 
selecting between the policies is impaired to the extent that protection against 
running out varies with the lot size as well as the reorder point. The Wilson lot 
size formula and the Eisenhart reorder point formula have been combined by 
Whitin [6] for the purpose of simultaneously determining the optimum lot 
size and reorder point. Calculations by James C. Emery [3] showed that for 
the parameters used, which were reasonable but limited in their range, the lot 
size found by this model is not materially different from that obtained from 
the Wilson formula alone. The simultaneous determination of the lot size and 
the reorder point when demand is uncertain has been investigated on a very 
general level and under simplifying assumptions which limit the applicability 
of the results by K. Arrow, T. Harris, and J. Marschak [2]. 


REFERENCES 


[1] Allen, R. G. D., Mathematical Analysis for Economists. London, 1949, 228-34. 

[2] Arrow, K., Harris, T., and Marschak, J., “Optimal inventory policy,” Econometrica, 
19 (1951), 250-72. 

[3] Emery, J. C., Some Theoretical Aspects of Inventory Control, Master’s Thesis in Indus- 
trial Management, M.I.T., 1954. 

[4] Feller, W., Probability Theory and Its Applications, Vol. I. New York, 1950, 220-1. 

[5] Schneider, E., Wirtschaftlichkeitsrechnung. Bern, 1951, 120-3.

[6] Whitin, T. M., The Theory of Inventory Management. Princeton, 1953. 








DISTRIBUTIONS POSSESSING A MONOTONE 
LIKELIHOOD RATIO 


Samuel Karlin
Stanford University


AND


H. Rubin
University of Oregon
A sufficient condition for the validity of the commonly used one-sided
tests is that the underlying density p(z, w), where z is the observed
variable and w the unknown parameter, possess a “monotone likelihood
ratio.” In estimation, only monotone functions of the observed variable z
should be used as possible estimating functions of w.


INTRODUCTION 


Many classical procedures for testing statistical hypotheses on the basis of a 
set of observations can be summarized as follows: 

a) We calculate from the observations the value of some statistic, e.g., a 
chi-square, an F ratio, or a sample average. 

b) We compare the value thus calculated with some pre-assigned critical 
value, rejecting the hypothesis if the test statistic comes out larger than the 
critical value, and accepting the hypothesis in the contrary case. 

In this paper we shall be concerned only with the case in which the test 
statistic has the property of sufficiency [2, pp. 208-230] which roughly means 
that we can control our risk of error in the test just as well by the proper use 
of the single test statistic as we could if we used the detailed data provided by 
the individual observations. This restriction enables us to carry on the discus- 
sion as though we were dealing only with a single observation, although the 
reader should keep in mind that our results apply to the case in which the value 
of this “observation” may actually represent the result of combining a large 
number of direct experimental results. For example, if we assume that we are 
dealing with n independent observations from a binomial distribution in which 
the probability of success is p and wish to test the hypothesis p ≤ p₀ against 
the alternative p > p₀, it is well known that we may as well use a test that 
depends only on the number x of successes observed in the n trials, instead of 
taking into account the order in which the successes occurred. The standard 
test is then of the form: if x turns out to be greater than some pre-assigned 
number c, reject the hypothesis that p ≤ p₀. 

Other examples of this type are familiar to every statistician. Thus it is 
important to determine the circumstances under which such procedures, which 
seem intuitively reasonable, can be theoretically justified. For instance, it is 
known that the test just described for the binomial distribution is uniformly 
most powerful among tests of level of significance α. This last result is a 
well-known isolated fact. It is the purpose of this exposition to elaborate on 
such one-sided tests and their relationship to statistical decision theory. 




The general theory of statistical decision-making as formulated by Wald has 
made it possible to set up a general framework within which one can evaluate 
the worth of any statistical procedure. For example, the theory of testing hy- 
potheses as developed by Neyman and Pearson and the theory of estimation 
of parameters by means of confidence intervals emerge as special cases of the 
general theory of Wald. Many other basic notions which have been introduced 
to aid the study of statistical problems, such as unbiasedness and the concept 
of uniformly most powerful tests, can also be analyzed by means of this unified 
theory. Thus, it is clear that the generality of Wald’s decision theory is 
tremendous. What still seems to be lacking, however, are further specific 
applications and uses for this theory. The gap between the general theory and 
its application to specific problems is far too wide, and it is therefore our 
aim to fill this gap to a certain extent. 


1. Distributions with a monotone likelihood ratio 

The concept of the likelihood ratio has long been used by statisticians as a 
criterion for making decisions. Let us begin by defining what we mean by a 
likelihood ratio. If the outcome of some chance experiment is described by a 
random variable x whose density function has a known form p(x, ω) which is 
specified by an unknown parameter ω, then following the terminology of 
Fisher we can call p(x, ω) the likelihood function. If ω₁ and ω₂ are two 
possible values of the unknown parameter, 

\[ \frac{p(x, \omega_1)}{p(x, \omega_2)} \]

is called the likelihood ratio. It is a well-known fact that if the parameter ω 
is known to be either ω₁ or ω₂, then the best possible procedure for deciding 
which is the correct value of ω is to observe the outcome x of the chance 
experiment and, if the likelihood ratio is larger than some critical value, 
decide that ω₁ is the correct value. 

We will say that the density function possesses a monotone likelihood ratio if 

\[ \frac{p(x, \omega_1)}{p(x, \omega_2)} \]

is a non-decreasing function of x whenever ω₁ > ω₂. This implies that if 
x₁ > x₂, then 

\[ \frac{p(x_1, \omega)}{p(x_2, \omega)} \]

is a non-decreasing function of ω. It is conceivable that for some values of x 
and ω, p(x, ω) = 0. Hence we may avoid this complication by defining the 
density to have a monotone likelihood ratio if, whenever x₁ > x₂ and ω₁ > ω₂, 

\[ p(x_1, \omega_1)\, p(x_2, \omega_2) - p(x_1, \omega_2)\, p(x_2, \omega_1) \ge 0. \tag{1} \]

This is seen to be equivalent to the above definition if the densities are 
strictly positive. 
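
Inequality (1) lends itself to a direct numerical check on a grid of values. 
The following Python sketch is offered only as an illustration; the function 
names, the grids, and the choice of n = 10 binomial trials are assumptions made 
here and are not part of the original paper. A finite grid can reveal a 
violation of (1) but cannot, of course, prove that the inequality holds 
everywhere. 

```python
from itertools import combinations
from math import comb, pi


def binomial_pmf(x, p, n=10):
    """Binomial probability of x successes in n trials (n = 10 is an arbitrary choice)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def cauchy_pdf(x, w):
    """Cauchy density with location parameter w."""
    return 1.0 / (pi * (1.0 + (x - w) ** 2))


def satisfies_inequality_1(p, xs, ws, tol=1e-12):
    """Check p(x1,w1)p(x2,w2) - p(x1,w2)p(x2,w1) >= 0 for all grid pairs x1 > x2, w1 > w2."""
    for x2, x1 in combinations(sorted(xs), 2):        # pairs come out ordered, so x1 > x2
        for w2, w1 in combinations(sorted(ws), 2):    # likewise w1 > w2
            if p(x1, w1) * p(x2, w2) - p(x1, w2) * p(x2, w1) < -tol:
                return False
    return True


binomial_ok = satisfies_inequality_1(binomial_pmf, range(11), [0.1, 0.3, 0.5, 0.7, 0.9])
cauchy_ok = satisfies_inequality_1(cauchy_pdf, [-4, -2, 0, 2, 4, 6], [-1.0, 0.0, 1.0])
print("binomial satisfies (1) on the grid:", binomial_ok)
print("Cauchy satisfies (1) on the grid:", cauchy_ok)
```

On these grids the binomial family passes, while the Cauchy family fails, for 
instance at x₁ = 4, x₂ = 2, ω₁ = 1, ω₂ = −1. 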













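Throughout this paper we shall be concerned only with distributions which 
obey inequality (1). The most noteworthy class of such distributions consists 
of the exponential family of distributions, i.e., distributions whose density can 
be represented as 

\[ p(x \mid \omega) = B(\omega)\, e^{x\omega} f(x). \]

For these densities, 

\[ p(x_1 \mid \omega_1)\, p(x_2 \mid \omega_2) - p(x_1 \mid \omega_2)\, p(x_2 \mid \omega_1)
   = B(\omega_1) B(\omega_2) \left[ e^{(x_1 - x_2)(\omega_1 - \omega_2)} - 1 \right] f(x_1) f(x_2)\, e^{x_1\omega_2 + x_2\omega_1}, \]

which is non-negative if x₁ > x₂ and ω₁ > ω₂ (and strictly positive wherever 
f(x₁)f(x₂) > 0). 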

The exponential family of distributions includes such densities as the normal 
with prescribed variance, the binomial, the Poisson, and the Gamma [2, p. 179]. 
A more general class of distributions for which (1) is valid is given in a recent 
paper [6]. This class includes as special cases the non-central t and the 
non-central F densities. Other examples of densities with monotone likelihood 
ratios which occur in many practical situations are as follows: 

\[ p(x \mid \omega) = \begin{cases} \dfrac{n x^{n-1}}{\omega^{n}}, & 0 < x < \omega,\ \omega > 0,\ n \text{ a fixed positive integer}, \\ 0, & \text{elsewhere}; \end{cases} \]

\[ p(x \mid \omega) = \begin{cases} n(n-1)\, \dfrac{x^{n-2}(\omega - x)}{\omega^{n}}, & 0 < x < \omega,\ \omega > 0,\ n \text{ an integer} \ge 2, \\ 0, & \text{elsewhere}; \end{cases} \]

\[ p(x, \theta, \omega) = \begin{cases} \dfrac{1}{\theta}\, e^{-(1/\theta)(x - \omega)}, & x > \omega, \\ 0, & x \le \omega. \end{cases} \]

This last distribution is known as the exponential or waiting time distribution 
and occurs in some models of life testing experiments [3]. It possesses a 
monotone likelihood ratio with respect to either parameter θ or ω. The two 
previous distributions are extreme value distributions arising from the uniform 
distributions [4, p. 241]. This enumeration of examples strongly indicates that 
many of the usual distributions occurring in statistical applications possess 
monotone likelihood ratios. 

A simple criterion often useful in determining whether a density p(x, ω) has a 
monotone likelihood ratio is that 

\[ \frac{\partial^2}{\partial x\, \partial \omega} \log p(x, \omega) \]

should be non-negative. 
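
The criterion may be explored numerically by approximating the mixed second 
derivative of log p(x, ω) with central finite differences. The sketch below is 
an illustration under assumptions made here (the step size h, the grid, and the 
two example densities); it is not taken from the original paper. 

```python
from math import exp, log, pi, sqrt


def mixed_partial_log(p, x, w, h=1e-3):
    """Central finite-difference estimate of the mixed derivative d^2 log p / (dx dw)."""
    f = lambda a, b: log(p(a, b))
    return (f(x + h, w + h) - f(x + h, w - h) - f(x - h, w + h) + f(x - h, w - h)) / (4.0 * h * h)


def smallest_mixed_partial(p, grid):
    """Smallest estimated mixed derivative over all (x, w) pairs of the grid."""
    return min(mixed_partial_log(p, x, w) for x in grid for w in grid)


def normal_pdf(x, w):
    """Normal density with mean w and unit variance."""
    return exp(-0.5 * (x - w) ** 2) / sqrt(2.0 * pi)


def cauchy_pdf(x, w):
    """Cauchy density with location parameter w."""
    return 1.0 / (pi * (1.0 + (x - w) ** 2))


GRID = [i / 2 for i in range(-8, 9)]   # -4.0, -3.5, ..., 4.0
print("normal:", smallest_mixed_partial(normal_pdf, GRID))
print("Cauchy:", smallest_mixed_partial(cauchy_pdf, GRID))
```

For the normal density the mixed derivative is identically one, so the 
criterion is met. For the Cauchy density it equals 2(1 − u²)/(1 + u²)² with 
u = x − ω, which is negative whenever |u| > 1, so the criterion fails. 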


2. Testing hypotheses 


A special case of the general theory of decisions arises when only two 
decisions are possible. This is, in fact, the situation with which we are faced 
in testing hypotheses, where a hypothesis is to be either accepted or rejected. 
Thus, if our observed data are distributed according to the density p(x, ω) 
where ω is the unknown parameter, often referred to as the state of nature, we 
may be interested in deciding if ω < ω₀ or ω ≥ ω₀ where ω₀ is some fixed number. 
Certain losses are assumed to be suffered in the case of wrong decisions, so we 
will define L₁(ω) to be the loss if decision 1 is made when ω is the true state 
of nature and L₂(ω) to be the loss when decision 2 is made and ω is the true 
state of nature. Here decision 1 is the decision that ω < ω₀ and decision 2 the 
decision that ω ≥ ω₀. The classical theory of testing hypotheses corresponds 
with the special case where 

\[ L_1(\omega) = \begin{cases} 0 & \text{if } \omega < \omega_0, \\ 1 & \text{if } \omega \ge \omega_0, \end{cases}
   \qquad
   L_2(\omega) = \begin{cases} 1 & \text{if } \omega < \omega_0, \\ 0 & \text{if } \omega \ge \omega_0. \end{cases} \]

A more general formulation of the problem might allow the losses to be 
arbitrary functions of the state of nature ω. 

In order to state the main result for this problem, we must introduce the 
concept of a complete class of tests. A test is simply a rule which tells for each 
observed value z, whether to make decision 1 or decision 2. Any collection of 
such decision rules will be referred to as a class of tests. The risk associated 
with any test is defined to be the expected loss incurred if that test, i.e., 
decision rule, is used. In our particular problem the risk is merely the 
probability that the test will decide that ω ≥ ω₀ when actually ω < ω₀, or else 
it is the probability that the test will decide that ω < ω₀ when actually 
ω ≥ ω₀. Let us indicate a test by T and a class of tests by {T_α}. The class 
{T_α} of tests is called complete if for any test T which does not belong to 
the class, there is a test T₁ in the class whose risk is no larger than the 
risk for the test T for every value of ω. 
If a given class of tests is complete, then the statistician need not examine any 
other tests outside the class in his search for a “best” test. There may, however, 
be many complete classes of tests. The class of all conceivable tests is, of course, 
complete; but, what is really wanted is the smallest possible complete class— 
the minimal complete class. Once a complete class or minimal complete class 
is specified, further requirements are usually imposed in order to select a specific 
test from the class. The most common requirement is that of specifying a 
certain significance level. 

The fundamental result for the problem stated above is that the class of 
“monotone tests” is a minimal complete class when the underlying distributions 
have a monotone likelihood ratio. A monotone test is described as follows: 
Choose a critical value x₀ and accept the hypothesis if x > x₀ but reject it if 
x < x₀. Each value of x₀ determines a member of the class {T_α}. For any given 
test T which is not monotone, there exists a unique monotone test T₁ whose risk 
is strictly less than the risk of T for every state of nature ω except ω = ω₀. 
(The monotone test T₁ is determined so that the risk for T is the same as the 
risk for T₁ at ω = ω₀.) 
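
The improvement of a non-monotone test by a monotone one can be illustrated 
numerically. The sketch below assumes a single observation x from N(ω, 1), 
takes ω₀ = 0, and compares a hypothetical non-monotone test (decide ω ≥ ω₀ 
when 0 < x < 2) with the monotone test whose critical value is matched so that 
the two risks coincide at ω = ω₀. All names and numerical choices are 
assumptions made here for illustration. 

```python
from statistics import NormalDist

STD = NormalDist()        # standard normal; the observation is x ~ N(w, 1)
W0 = 0.0                  # boundary between the states "w < w0" and "w >= w0"
A, B = 0.0, 2.0           # hypothetical non-monotone test: decide "w >= w0" iff A < x < B


def prob_upper_interval(w):
    """P(A < x < B) when x ~ N(w, 1): chance the non-monotone test decides w >= w0."""
    return STD.cdf(B - w) - STD.cdf(A - w)


# Monotone test: decide "w >= w0" iff x > X0, with X0 matched to the other test at w = w0.
X0 = W0 + STD.inv_cdf(1.0 - prob_upper_interval(W0))


def prob_upper_monotone(w):
    """P(x > X0) when x ~ N(w, 1): chance the monotone test decides w >= w0."""
    return 1.0 - STD.cdf(X0 - w)


def risk(prob_upper, w):
    """Risk under 0-1 loss: probability of the wrong decision when w is the true state."""
    p = prob_upper(w)
    return p if w < W0 else 1.0 - p


for w in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0):
    print(w, round(risk(prob_upper_monotone, w), 4), round(risk(prob_upper_interval, w), 4))
```

At each listed state of nature the monotone test shows the smaller risk, in 
agreement with the result stated above for densities possessing a monotone 
likelihood ratio. 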

Although the monotone tests are intuitive and have arisen in some of the 
classical testing problems, the foregoing result shows just what class of distri- 
butions (those with monotone likelihood ratios) justifies the use of monotone 
procedures. For example, if the underlying density is c[cosh(x − ω)]⁻¹ and it is 
desired to test whether ω ≥ 0 against the alternative ω < 0, then the optimal 
tests are of the form: accept the hypothesis if x > x₀ and reject the hypothesis 
otherwise. This density occurs in connection with the study of Brownian 
motion and has a monotone likelihood ratio. In contrast, let the underlying 
density be the Cauchy density for which 

\[ p(x, \omega) = \frac{1}{\pi}\, \frac{1}{1 + (x - \omega)^2} \tag{2} \]

and consider as above the one-sided testing problem. For this problem it can 
be shown that there exist statistical procedures T, non-monotone, which cannot 
be uniformly improved upon in terms of risk by any monotone test. The 
explanation, of course, is that the Cauchy density does not possess a monotone 
likelihood ratio. To substantiate these assertions, let us consider testing the 
simple hypothesis with density given by 

\[ f_0(x) = \frac{1}{\pi}\, \frac{1}{1 + (x - 1)^2} \tag{3} \]

against the simple alternative given by 

\[ f_1(x) = \frac{1}{\pi}\, \frac{1}{1 + (x + 1)^2} \tag{4} \]

where the a priori probability that the hypothesis is true is known to be 17/54. 
The classical likelihood ratio test reduces to the following: the hypothesis is 
accepted if 0.4 ≤ x ≤ 5 and rejected otherwise. Consequently, if we consider a 
one-sided composite testing problem for the Cauchy density as described 
previously, then it follows that the test in which action 2 is taken if 
0.4 < x < 5 and action 1 is taken in the contrary case is evidently not a 
monotone test and cannot be improved upon by monotone tests. 
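
The acceptance region of the Cauchy example can be recovered by elementary 
algebra. The sketch below assumes that the hypothesis density is a Cauchy 
density centred at ω = 1 and the alternative a Cauchy density centred at 
ω = −1; these locations are an assumption made here, chosen because they 
agree with the stated prior probability 17/54 and the stated region. The 
hypothesis is accepted when the prior-weighted density under the hypothesis is 
at least the prior-weighted density under the alternative, which for Cauchy 
densities is a quadratic inequality in x. 

```python
from fractions import Fraction
from math import sqrt

PRIOR = Fraction(17, 54)     # a priori probability that the hypothesis is true
LOC_H, LOC_A = 1, -1         # assumed Cauchy locations under hypothesis and alternative

# Accept when PRIOR * f0(x) >= (1 - PRIOR) * f1(x).  Cross-multiplying the Cauchy
# densities gives PRIOR * (1 + (x - LOC_A)^2) >= (1 - PRIOR) * (1 + (x - LOC_H)^2),
# i.e. a*x^2 + b*x + c >= 0 with the coefficients below.
a = PRIOR - (1 - PRIOR)
b = -2 * PRIOR * LOC_A + 2 * (1 - PRIOR) * LOC_H
c = PRIOR * (1 + LOC_A**2) - (1 - PRIOR) * (1 + LOC_H**2)

disc = b * b - 4 * a * c
roots = sorted((-b + sign * sqrt(disc)) / (2 * a) for sign in (1, -1))

# Because a < 0, the quadratic is non-negative between its two roots.
print("accept the hypothesis for x between", roots[0], "and", roots[1])
```

Since the leading coefficient is negative, the hypothesis is accepted on the 
bounded interval between the two roots, which is not a one-sided region. 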

A distribution with a monotone likelihood ratio reflects an order-preserving 
relationship between the observation and the parameter. More precisely, even 
in the Cauchy distribution we see that “if ω is small, x will probably be small.” 
But it requires something more, essentially the monotone character of the 
likelihood ratio test, to enable us to go from the statement “if ω is small, x 
will probably be small” to the decision rule “if x is small, assume that ω is 
small.” 

The classical one-sided t-test for the parameter μ/σ (μ = mean, σ² = variance) 
and the classical chi-square test for σ², where the observations arise from a 
normal population N(μ, σ²), are monotone tests where the class of tests has 
been restricted to those which are functions of x̄/s and of s², respectively, 
where s² is the sample variance. The restriction to such statistics can be 
justified by 
limiting ourselves to unbiased tests or to tests which are independent of the 
scale of measurement. The standard F-test for the ratio of two variances also 
reduces to a monotone test where the underlying density is the non-central 
F distribution. 

Finally, another basic property possessed by distributions which have a 
monotone likelihood ratio is the monotonic nature of the power functions for 
monotone tests. From this it readily follows that in any one-sided test of a 
hypothesis, the one-sided critical region for a prescribed error of type one is 
a uniformly most powerful test [7]. 


4. Three-action case 


To illustrate further some of the statistical properties of a distribution 
possessing a monotone likelihood ratio, consider the following three-action 
problem: It is desired to classify a product into three categories on the basis 
of the mean of a sample from a population having a distribution N(μ, 1). If 
the true mean, μ, is not less than μ₁, we desire to label the product A (best 
quality). If μ₂ < μ < μ₁, then the product should be labelled B (medium quality). 
In the case where μ ≤ μ₂, the product should be labelled C (inferior quality). On 
the basis of the sample tested, and by means of the statistic x̄, we wish to 
classify the product as A, B, or C. The losses due to an incorrect decision when 
the true state is given by μ are 

\[ L_1(\mu) = \begin{cases} 0 & \mu_1 \le \mu \\ 1 & \mu_2 < \mu < \mu_1 \\ 2 & \mu \le \mu_2 \end{cases}
   \qquad
   L_2(\mu) = \begin{cases} 1 & \mu_1 \le \mu \\ 0 & \mu_2 < \mu < \mu_1 \\ 1 & \mu \le \mu_2 \end{cases}
   \qquad
   L_3(\mu) = \begin{cases} 2 & \mu_1 \le \mu \\ 1 & \mu_2 < \mu < \mu_1 \\ 0 & \mu \le \mu_2 \end{cases} \]

Here L₁, L₂, and L₃ denote the losses incurred by labelling the product A, B, 
and C, respectively. In other words, the penalty increases with the degree of 
error. It is more costly to classify a product into group C than into group B 
when the true state is A. Again, a minimal complete class of procedures can be 
characterized as follows: choose two critical values x₁ and x₂ (x₁ > x₂) and 
label the product A if x̄ > x₁, B if x₁ > x̄ > x₂, and C if x̄ < x₂. Statistical 
procedures of this type for this problem are again referred to as “monotone 
procedures.” The critical values can usually be chosen to satisfy any two 
prescribed constraints. Finally, here as before, constructive methods can be 
given for determining a monotone test whose risk is everywhere smaller than the 
risk of any specified non-monotone procedure. 
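
The risk of a monotone three-action procedure is easily evaluated. The sketch 
below assumes the loss structure of this section (0, 1, or 2 according to the 
degree of misclassification) and uses hypothetical values for the quality 
boundaries, the critical values, and the sample size; none of these numbers 
comes from the original paper. 

```python
from statistics import NormalDist

MU1, MU2 = 1.0, 0.0      # hypothetical quality boundaries, MU1 > MU2
X1, X2 = 1.0, 0.0        # hypothetical critical values for the sample mean, X1 > X2
N_OBS = 4                # hypothetical sample size; the sample mean is N(mu, 1/N_OBS)


def action_probs(mu):
    """Probabilities of labelling the product A, B, C under the monotone rule."""
    xbar = NormalDist(mu, (1.0 / N_OBS) ** 0.5)
    p_c = xbar.cdf(X2)                   # sample mean below X2: label C
    p_a = 1.0 - xbar.cdf(X1)             # sample mean above X1: label A
    return p_a, 1.0 - p_a - p_c, p_c     # the remaining probability: label B


def losses(mu):
    """Losses (L1, L2, L3) of labelling A, B, C when mu is the true mean."""
    if mu >= MU1:
        return 0, 1, 2
    if mu > MU2:
        return 1, 0, 1
    return 2, 1, 0


def risk(mu):
    """Expected loss of the monotone procedure when mu is the true mean."""
    return sum(loss * prob for loss, prob in zip(losses(mu), action_probs(mu)))


for mu in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(mu, round(risk(mu), 4))
```

Varying X1 and X2 in such a calculation shows how the two critical values may 
be tuned to meet two prescribed constraints on the risk. 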

This example is typical of a class of n-action statistical problems for which the 
losses corresponding to the various actions are monotone in an appropriate 
sense and the fundamental distributions have a monotone likelihood ratio. 
Complete classes of statistical strategies can be fully determined for such gen- 
eral situations. 


5. Estimation 


Finally, we shall indicate some results for estimation problems in which the 
underlying distributions possess a monotone likelihood ratio. Suppose, for 
example, we are interested in estimating the parameter ω with the loss function 
(t − ω)² where t is an estimate, i.e., some function of the observations. An 
essentially complete class of estimators is the class of all monotone functions 
of a sufficient statistic. In selecting an estimator from among the class of 
all monotone functions of a sufficient statistic, some other requirement must 
be imposed. For example, to estimate the mean of N(μ, 1) with the square of the 
error as the loss function, a minimax criterion leads to the particular 
monotone estimate t(x) = x. 

A related situation is that in which we wish to estimate ω if |ω − ω₀| ≥ ε, but 
wish to take an alternate action if |ω − ω₀| < ε. Then the complete class of 
procedures has the form: there exist two critical values x₁ and x₂ such that for 
x < x₁ a monotone function t(x) for which t(x) ≤ ω₀ − ε is used to estimate ω, 
if x > x₂ we should use a monotone function t such that t(x) ≥ ω₀ + ε, and 
if x₁ < x < x₂ we should take the alternative action. 


REFERENCES 


[1] Allen, S. G., Jr., “A class of minimax tests for one-sided composite hypotheses,” 
Annals of Mathematical Statistics, 24 (1953), 295-8. 

[2] Blackwell, David, and Girshick, M. A., Theory of Games and Statistical Decisions. New 
York: John Wiley and Sons, 1954. 

[3] Epstein, Benjamin, and Sobel, Milton, “Life Testing,” Journal of the American 
Statistical Association, 48 (1953), 486-502. 

[4] Hoel, Paul, Introduction to Mathematical Statistics, Second Edition. New York: John 
Wiley and Sons, 1954. 

[5] Karlin, Samuel, and Rubin, Herman, “The theory of decision procedures for distribu- 
tions with monotone likelihood ratio,” Annals of Mathematical Statistics, 27 (1956), 
272-300. 

[6] Karlin, Samuel, “Decision theory of Polya type distributions. Case of two actions I,” 
to appear in Proc. Third Berkeley Symposium on Probability and Statistics, vol. 1. 

[7] Karlin, Samuel, “Decision theory of Polya type distributions II,” submitted to the 
Annals of Mathematical Statistics. 

[8] Lehmann, E. L., “Ordered families of distributions,” Annals of Mathematical Statistics, 
26 (1955), 399-419. 

[9] Sobel, Milton, “An essentially complete class of decision functions for certain stand- 
ard sequential problems,” Annals of Mathematical Statistics, 24 (1953), 319-37. 





QUADRATIC EXTRAPOLATION AND A RELATED 
TEST OF HYPOTHESES* 


A. DE LA GARZA 
Union Carbide Nuclear Company, Oak Ridge, Tennessee 


This paper discusses a problem in the spacing of observations in 
quadratic regression for most precise extrapolation. Let 
E(y_i) = α + βx_i + γx_i². The y_i are uncorrelated observations with 
variance V₀; the x_i are controlled. The problem answered is: at which 
x-values in a specified interval [x_L, x_H] should N observations y_i, 
i = 1(1)N, be taken so that for a ξ > x_H (or ξ < x_L) the least squares 
estimate of α + βξ + γξ² has minimum variance? It is shown that the 
observations are located at x_L, (x_L + x_H)/2, and x_H. The distribution 
at these locations as a function of ξ is given. These results are also 
useful in designing experiments for testing the hypothesis of a quadratic 
relation versus that of a linear relation. 


1. INTRODUCTION AND SUMMARY 


Daniel and Heerema [2] have discussed problems in the spacing of observations 
in linear regression for most precise linear extrapolation and slope 
estimation. In this paper we extend some of their results to quadratic regres- 
sion. We make use of Elfving’s work [4] on optimum allocation in linear re- 
gression theory. Due to the simple treatment possible in the particular case of 
quadratic regression, we re-derive some results which are special cases in 
Elfving’s work. 

Let 

\[ E(y_i) = \alpha + \beta x_i + \gamma x_i^2. \tag{1} \]

The y_i are uncorrelated observations with expectation E(y_i) and variance V₀. 
The x_i are controlled variates without error. The parameters α, β, and γ are 
unknown. Suppose we are permitted N observation points (x_i, y_i), i = 1(1)N, 
restricted by x_L ≤ x_i ≤ x_H. In this situation we consider the following 
problem in quadratic extrapolation: At which x-values in the specified interval 
[x_L, x_H] should we take the permitted N observations y_i, i = 1(1)N, so that 
for a given ξ > x_H we minimize the variance V[Y(ξ)] of Y(ξ), the least squares 
estimate of α + βξ + γξ²? (Instead of a ξ > x_H, we could have specified 
ξ < x_L.) 

The motivation for the problem is clear; we desire a procedure for quadratic 
extrapolation which is optimum in the sense of minimum variance. For this 
purpose we show that the N observations y_i, i = 1(1)N, are always taken at 
x_L, (x_L + x_H)/2, and x_H; the distribution of the N observations at these 
three locations depends on the extrapolation point ξ as shown in Table I. For ξ 
remote from the interval [x_L, x_H], the distribution approaches N/4 
observations at x_L, N/2 at (x_L + x_H)/2, and N/4 at x_H. We also show that 
this limiting distribution is that which minimizes the variance of the least 
squares estimate of γ, the 
quadratic coefficient. Hence, this distribution may be used to advantage in 
designing experiments for testing the hypothesis of a quadratic relation versus 
that of a linear relation. 

* Work done under AEC Contract No. W-7405-eng.-26. 




Before we present our results, it is to be emphasized that an extrapolation 
made without sure knowledge of the underlying function is always risky. This 
paper is not meant to encourage such procedures. Rather, the intent is to show 
how even at their best, in the relatively simple case of a quadratic, extrapola- 
tion problems must be handled with care. 

2. SOLUTION OF THE QUADRATIC EXTRAPOLATION PROBLEM 


We consider the quadratic extrapolation problem stated in the Introduction. 
It is known that the least squares estimate of α + βξ + γξ² is 

\[ Y(\xi) = a + b\xi + c\xi^2, \tag{2} \]

where a, b, and c are the least squares estimates of α, β, and γ, respectively. 
The variance of Y(ξ) is 

\[ V[Y(\xi)] = (1 \;\; \xi \;\; \xi^2)\, M^{-1}\, (1 \;\; \xi \;\; \xi^2)'\, V_0, \tag{3} \]

where M is the square matrix of the normal equations. Thus 

\[ M = \begin{pmatrix} N & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{pmatrix}, \tag{4} \]

the summation index being i = 1(1)N. The quadratic extrapolation problem 
reduces to minimizing (3) subject to satisfying x_L ≤ x_i ≤ x_H in (4). 
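
For any proposed set of observation points the relative variance in (3) can be 
evaluated directly. The following Python sketch builds the matrix of the normal 
equations and evaluates the quadratic form in exact rational arithmetic; the 
function name and the example design are assumptions made here for 
illustration. 

```python
from fractions import Fraction


def relative_variance(xs, xi):
    """V[Y(xi)] / V0 = (1, xi, xi^2) M^{-1} (1, xi, xi^2)' for observation points xs."""
    xs = [Fraction(x) for x in xs]
    xi = Fraction(xi)
    # Matrix of the normal equations: entry (r, s) is the sum of x_i^(r + s).
    m = [[sum(x ** (r + s) for x in xs) for s in range(3)] for r in range(3)]
    v = [xi**k for k in range(3)]
    # Solve M u = v by Gauss-Jordan elimination.  The pivot search raises
    # StopIteration if M is singular (fewer than three distinct x-values).
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = next(r for r in range(col, 3) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [entry / aug[col][col] for entry in aug[col]]
        for r in range(3):
            if r != col:
                factor = aug[r][col]
                aug[r] = [e - factor * p for e, p in zip(aug[r], aug[col])]
    u = [aug[r][3] for r in range(3)]
    return sum(vk * uk for vk, uk in zip(v, u))


# Hypothetical design on [-1, 1]: 7, 21 and 21 observations at -1, 0 and 1.
design = [-1] * 7 + [0] * 21 + [1] * 21
print(relative_variance(design, 2))
```

With this design and ξ = 2 the relative variance is exactly one, so 49 
observations give a relative standard error of one at that extrapolation point. 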

We have the following direct result from [4] which simplifies our 
extrapolation problem. Let 

\[ E(y_i) = x_{1i}\beta_1 + x_{2i}\beta_2 + x_{3i}\beta_3. \]

The y_i are uncorrelated observations with expectation E(y_i) and variance V₀. 
The x’s are controlled variates without error, and the β’s are the unknown 
parameters. Let a source of information be described by the values of the 
controlled variates x₁, x₂, x₃ at which a y-observation may be taken. Elfving has 
shown in [4] that among the sources of information at the experimenter’s 
disposal, there are in general three sources which are relevant to minimizing 
the variance of the estimate of a single linear combination of the parameters. 
(The number of observations at each source is then taken to be a continuous 
quantity.) 

We are thus assured that for our extrapolation problem it suffices to 
distribute the N observations at only three distinct x-locations in the 
specified interval [x_L, x_H]. Let these be X₁, X₂, and X₃, and let us agree 
that X₁ < X₂ < X₃. Let n_j observations be taken at X_j so that Σ n_j = N, 
j = 1(1)3. Let ȳ_j be the arithmetic mean of the n_j observations at X_j. We may 
then verify that a + bx + cx², the least squares estimate of α + βx + γx², 
passes through the points (X_j, ȳ_j). Hence, Y(ξ) may be written in the 
Lagrange form 

\[ Y(\xi) = \frac{(\xi - X_2)(\xi - X_3)}{(X_1 - X_2)(X_1 - X_3)}\,\bar{y}_1
          + \frac{(\xi - X_1)(\xi - X_3)}{(X_2 - X_1)(X_2 - X_3)}\,\bar{y}_2
          + \frac{(\xi - X_1)(\xi - X_2)}{(X_3 - X_1)(X_3 - X_2)}\,\bar{y}_3, \tag{5} \]

which will be written 

\[ Y(\xi) = \sum_{j=1}^{3} F_j \bar{y}_j. \tag{6} \]

We then have 

\[ V[Y(\xi)]/V_0 = \sum_{j=1}^{3} F_j^2 / n_j. \tag{7} \]

For a given X₁, X₂, and X₃, we now find the n_j which minimize (7) constrained 
by Σ n_j = N and n_j > 0, j = 1(1)3. Using the method of Lagrange multipliers, 
we find that 

\[ n_j = N\,|F_j| \Big/ \sum_{k=1}^{3} |F_k|, \tag{8} \]

and in this case, 

\[ V[Y(\xi)]/V_0 = \Big( \sum_{j=1}^{3} |F_j| \Big)^2 \Big/ N. \tag{9} \]

We now proceed to find the X_j that minimize (9). We consider the case 
ξ > x_H. Let z_j = ξ − X_j. We then have z_j > 0 and z₃ < z₂ < z₁. From (5) and 
(6) we see that Σ|F_j| may be written 

\[ \sum_{j=1}^{3} |F_j| = \frac{z_2 z_3}{(z_1 - z_2)(z_1 - z_3)} + \frac{z_1 z_3}{(z_1 - z_2)(z_2 - z_3)} + \frac{z_1 z_2}{(z_1 - z_3)(z_2 - z_3)}, \]

which, since the Lagrange coefficients F_j sum to one and only F₂ is negative, 
may be reduced to 

\[ \sum_{j=1}^{3} |F_j| = 1 + \frac{2 z_1 z_3}{(z_1 - z_2)(z_2 - z_3)}. \tag{10} \]

We see that to minimize (10) and hence (9), we make z₃ as small as permissible, 
z₁ as large as permissible, and z₂ = (z₁ + z₃)/2. This means that 

\[ X_1 = x_L, \qquad X_2 = (x_L + x_H)/2, \qquad X_3 = x_H, \tag{11} \]

a conclusion also holding for the case ξ < x_L. Thus, to minimize (7), n_j is 
determined from (8) with X_j given by (11). This concludes the solution to the 
considered quadratic extrapolation problem. 
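
The solution just obtained reduces to a short computation: place the three 
locations at the two ends and the midpoint of the interval, evaluate the 
Lagrange coefficients at the extrapolation point, and allocate observations in 
proportion to their absolute values. The Python sketch below follows these 
steps; the function name and the example values are assumptions made here, and 
the resulting counts are treated as continuous quantities, without rounding. 

```python
def optimal_allocation(xi, x_lo, x_hi, n_total):
    """Return the continuous counts (n1, n2, n3) and the value of sqrt(N) * R(xi)."""
    pts = (x_lo, (x_lo + x_hi) / 2.0, x_hi)       # the three optimal locations
    # Lagrange coefficients F_j evaluated at the extrapolation point xi.
    f = []
    for j, xj in enumerate(pts):
        num = den = 1.0
        for k, xk in enumerate(pts):
            if k != j:
                num *= xi - xk
                den *= xj - xk
        f.append(num / den)
    total = sum(abs(v) for v in f)                # equals sqrt(N) times R(xi)
    counts = tuple(n_total * abs(v) / total for v in f)
    return counts, total


# Hypothetical use: data taken on [1, 2], extrapolation to 2.5, N = 49 observations.
counts, root_n_times_r = optimal_allocation(xi=2.5, x_lo=1.0, x_hi=2.0, n_total=49)
print([round(c, 2) for c in counts], root_n_times_r)
```

Dividing the returned counts by N gives the fractions of observations at the 
three locations, and the second returned value is the quantity from which the 
required N follows once a relative standard error is prescribed. 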


3. DISCUSSION OF QUADRATIC EXTRAPOLATION 


With the formulas given in section 2, we may construct the following table 
which shows how N y-observations are distributed at x_L, (x_L + x_H)/2, and x_H 
for ξ > x_H. For ξ < x_L, the entries under x_L and x_H are interchanged. 


                                  TABLE I 

Relative Deviation      Fraction of N Observations       √N times the 
 of Extrapolation                Taken at              Relative Standard 
      Point                                             Error of Extra- 
                                                         polated Value 
      D(ξ)          x_L     (x_L + x_H)/2     x_H          √N R(ξ) 

      1.00         0.000        0.000        1.000           1.00 
      1.25         0.074        0.265        0.661           2.12 
      1.50         0.107        0.357        0.536           3.50 
      1.75         0.128        0.402        0.470           5.12 
      2.00         0.143        0.428        0.429           7.00 
      2.50         0.163        0.457        0.380          11.5 
      3.00         0.176        0.470        0.354          17.0 
      4.00         0.194        0.484        0.322          31.0 
      5.00         0.204        0.490        0.306          49.0 
       ∞           0.250        0.500        0.250            ∞ 


D(ξ) = [ξ − ½(x_L + x_H)] / ½(x_H − x_L). 
R(ξ) = {V[Y(ξ)]/V₀}^{1/2}. 


An example will make the use of this table clear. Suppose we want to 
extrapolate to ξ = x_H + (x_H − x_L)/2. We wish to know how many y-observations 
are required and how they should be distributed so that the relative standard 
error of the extrapolated value is one. The table gives the answer as follows: 

For ξ = x_H + (x_H − x_L)/2, we have D(ξ) = 2. From the table, for D(ξ) = 2, we 
have √N R(ξ) = 7. From our precision requirements, R(ξ) = 1. Hence √N = 7, 
and N = 49 observations are required. These are distributed as follows: 


at x_L:             0.143 × 49 = 7 
at (x_L + x_H)/2:   0.428 × 49 = 21 
at x_H:             0.429 × 49 = 21 

This example shows what to expect of extrapolation. First, it is agreed that 
the extrapolation point is not very far away from “data.” The situation is 
similar to taking data in the range x_L = 1 to x_H = 2 and extrapolating to 
ξ = 2.5. It is also agreed that requiring the relative standard error of the 
extrapolated value equal to one is not a particularly strict precision demand. 
In quadratic interpolation, a relative standard error at most equal to one in 
the entire interval [x_L, x_H] can always be attained with only three 
observations. Our 
extrapolation is seen to require 49 observations, a rather large increase. This 
example is not an argument against extrapolation; quite often, extrapolation is 
unavoidable. The example is intended to show that extrapolation problems 
even at their best must be handled with care. Simple calculations like those 
exemplified here should be made to find out beforehand what may be expected 
from a proposed extrapolation. 







4. HYPOTHESIS OF A QUADRATIC RELATION VERSUS 
HYPOTHESIS OF A LINEAR RELATION 


We see from (3) that for large |ξ|, V[Y(ξ)] approaches its dominant term 
m³³V₀ξ⁴, where m³³ is the (3, 3) element of M⁻¹. It is known that m³³V₀ is V(c), 
the variance of the least squares estimate c of γ. Hence, for large |ξ|, when we 
minimize V[Y(ξ)], we minimize V(c). We may see from Table I and we may 
verify from (8) that as |ξ| → ∞, the limiting distribution of the N 
y-observations is N/4 observations at x_L, N/2 at (x_L + x_H)/2, and N/4 at 
x_H. This is the distribution that minimizes V(c), and clearly, it may be used 
to advantage in designing experiments for testing the hypothesis of a quadratic 
relation, E(y_i) = α + βx_i + γx_i², versus the hypothesis of a linear relation, 
E(y_i) = α + βx_i. 

We may verify by evaluating m³³ for the limiting distribution that 

\[ \min V(c) = 64\, V_0 \big/ \left[ N (x_H - x_L)^4 \right]. \]

When N is not divisible by 4, min V(c) can only be approximated. This may 
be done by distributing the N observations as shown in Table II. 

                      TABLE II 

                Number of Observations at 
               x_L     (x_L + x_H)/2     x_H 

N = 4p + 1      p         2p + 1          p 
N = 4p + 2     p + 1       2p            p + 1 
N = 4p + 3     p + 1      2p + 1         p + 1 

p = 1, 2, ... 


The ratio of min V(c) to V(c) given by the Table II distributions is given in 
Table III for p=1(1)5. 


                 TABLE III 

           min V(c)/V(c) given by 
         N = 4p+1    N = 4p+2    N = 4p+3 

p = 1     .9600       .8888       .9796 
    2     .9876       .9600       .9917 
    3     .9941       .9796       .9955 
    4     .9965       .9877       .9972 
    5     .9977       .9917       .9981 


We may see that with the exception of N =6, little improvement, if any, is 
indicated. 
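
The entries of Table III may be recomputed from the allocations of Table II. 
The sketch below rests on an assumption not spelled out in the text: with 
observations only at x_L, the midpoint, and x_H, the estimate c is 
proportional to the second difference of the three means, so that V(c) is 
proportional to 1/n₁ + 4/n₂ + 1/n₃, and the N/4, N/2, N/4 split gives the value 
16/N for this sum. The function names are likewise introduced here for 
illustration only. 

```python
def efficiency(n1, n2, n3):
    """min V(c) / V(c) for n1, n2, n3 observations at x_L, the midpoint, and x_H."""
    n = n1 + n2 + n3
    # V(c) is proportional to 1/n1 + 4/n2 + 1/n3; the ideal split attains 16/N.
    return (16.0 / n) / (1.0 / n1 + 4.0 / n2 + 1.0 / n3)


def table_ii(n):
    """Allocation of Table II for N = 4p + r with r = 1, 2, 3 and p >= 1."""
    p, r = divmod(n, 4)
    return {1: (p, 2 * p + 1, p), 2: (p + 1, 2 * p, p + 1), 3: (p + 1, 2 * p + 1, p + 1)}[r]


for p in range(1, 6):
    print(p, [round(efficiency(*table_ii(4 * p + r)), 4) for r in (1, 2, 3)])
```

The printed values agree with Table III to within one unit in the fourth 
decimal place. 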

An observation of P. B. Wood readily explains the above limiting distribution. 
Consider a quadratic y = α + βx + γx², and a chord intersecting the quadratic 
at x₁ and x₂. We have the following two properties: (1) the difference between 
an ordinate of the chord and the ordinate of the subtended arc at the same 
abscissa is greatest (in absolute value) at (x₁ + x₂)/2; (2) with one point of 
intersection fixed, say x₁, this greatest difference increases as x₂ is moved 
away from x₁. Hence, it follows that among all the chords that may be drawn in 
an interval [x_L, x_H], the greatest difference Δ corresponds to the chord with 
intersection points x_L and x_H, and Δ occurs at (x_L + x_H)/2. The departure 
of Δ from zero is a measure of departure from linearity. On taking n₁ 
observations at x_L, n₂ at (x_L + x_H)/2, and n₃ at x_H, we may estimate the 
signed Δ by 

\[ \frac{1}{2} \left( \sum_{i=1}^{n_1} y_{1i}/n_1 + \sum_{i=1}^{n_3} y_{3i}/n_3 \right) - \sum_{i=1}^{n_2} y_{2i}/n_2 \]

with variance 

\[ (1/4n_1 + 1/4n_3 + 1/n_2)\, V_0. \]

On restricting n₁ + n₂ + n₃ = N, we see that the variance of our estimate is 
minimized by letting n₁ = N/4, n₂ = N/2, n₃ = N/4, which is our previous result. 
The 
same geometrical idea should be of use in detecting departure from linearity 
given by curves other than the quadratic. 


5. REMARKS ON GENERALIZATION 


In closing, we refer to [1], [3], and [4], which are of interest in extending the 
results of this paper to more general situations. From Elfving’s original work 
[4], one may deduce the allocation of a prescribed number of observations in 
linear regression to minimize the variance of the estimate of a single linear 
combination of the parameters. Elfving further considers optimum allocation 
for the simultaneous estimation of the parameters by minimizing a positive 
definite quadratic form in the diagonal entries of the variance-covariance 
matrix of the estimates of the parameters. Elfving’s work also considers the 
situation where observations are at different costs and the total cost is pre- 
scribed; in this case, an optimum allocation of the total cost is made. Chernoff 
[1] establishes asymptotic criteria and generalizes Elfving’s work to other 
situations than those from linear regression theory; the allocation of a pre- 
scribed total cost is included. In [3] it is shown that in the regression of a poly- 
nomial of degree m, it suffices to allocate observations at only m+1 values of 
the sure variate in a specified interval for the optimization of a criterion involv- 
ing the variance-covariance matrix of the estimates of the polynomial coef- 
ficients. Our quadratic extrapolation problem in section 2 supplies an applica- 
tion of this result. Here, m=2 and allocation of observations at three values of 
the sure variate in the specified interval [x_L, x_H] suffices to minimize (3), the 
variance of the estimate of the extrapolated value. 


REFERENCES 


[1] Chernoff, H., “Locally optimum designs for estimating parameters,” Annals of 
Mathematical Statistics, 24 (1953), 586-602. 

[2] Daniel, C., and Heerema, N., “Design of experiments for most precise slope estimation 
or linear extrapolation,” Journal of the American Statistical Association, 45 (1950), 
546-56. 

[3] de la Garza, A., “Spacing of information in polynomial regression,” Annals of Mathe- 
matical Statistics, 25 (1954), 123-30. 

[4] Elfving, G., “Optimum allocation in linear regression theory,” Annals of Mathematical 
Statistics, 23 (1952), 255-62. 





CORRIGENDA 


Readers and authors are invited to submit corrections to papers published 
in any previous issue. These will be published each year in the December issue. 


Bainbridge, J. R., Grant, Alison M., and Radok, U., TABULAR ANALYSIS OF
FACTORIAL EXPERIMENTS AND THE USE OF PUNCH CARDS, Vol. 51, No. 273
(March 1956), 149-58. 

Lloyd S. Nelson has pointed out two errors on page 153. The first item in the
divisor column should read 3 instead of 4 in table 2(i), and 4 instead of 1 in 
table 2(iii). 


Chernoff, Herman, book review of Kendall, Exercises in Theoretical Statistics, 
Vol. 50, No. 272 (December 1955), 1334-5. 

p. 1335: Problem 39, not 58, is the one to which the reviewer believes the 
answer to be slightly incorrect. 


Crow, Edwin L., GENERALITY OF CONFIDENCE INTERVALS FOR A REGRESSION
FUNCTION, Vol. 50, No. 271 (September 1955), 850-3.

It was stated on page 851 that M. S. Bartlett, in a 1933 paper, specifically 
retained the requirement of fixed values of X for the test of the difference be-
tween two regression coefficients for two samples from possibly different multi-
variate normal populations. It has been called to the author’s attention that
Bartlett soon thereafter noted this restriction to be unnecessary in “The prob- 
lem in statistics of testing several variances,” Proceedings of the Cambridge 
Philosophical Society, 30 (1934), 164-9. 


Golub, Abraham, and Grubbs, Frank E., ANALYSIS OF SENSITIVITY EXPERI-
MENTS WHEN THE LEVELS OF STIMULUS CANNOT BE CONTROLLED, Vol. 51,
No. 274 (June 1956), 257-65. 


p. 259, equation (2), should read:
    L = \sum_i \{ \delta_i \log p_i + (1 - \delta_i) \log q_i \}


p. 259, equation (4a) should read:
    \frac{\partial L}{\partial \sigma} = \frac{1}{\sigma} \left[ \sum_s \frac{t_s z_s}{q_s} - \sum_r \frac{t_r z_r}{p_r} \right]

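p. 260, equation (7) should read:
    \frac{\partial^2 L}{\partial \mu^2} = \frac{1}{\sigma^2} \left[ \sum_s \frac{t_s z_s}{q_s} - \sum_s \frac{z_s^2}{q_s^2} - \sum_r \frac{t_r z_r}{p_r} - \sum_r \frac{z_r^2}{p_r^2} \right]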


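p. 260, equation (8) should read:
    \frac{\partial^2 L}{\partial \mu \, \partial \sigma} = \frac{1}{\sigma^2} \left[ \sum_s \frac{t_s^2 z_s}{q_s} - \sum_s \frac{t_s z_s^2}{q_s^2} - \sum_s \frac{z_s}{q_s} - \sum_r \frac{t_r^2 z_r}{p_r} - \sum_r \frac{t_r z_r^2}{p_r^2} + \sum_r \frac{z_r}{p_r} \right]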




Jerome Cornfield and Nathan Mantel have called attention to the fact that 
these results are to be found in earlier literature, in particular, Cornfield and 
Mantel, Journal of the American Statistical Association 45 (June 1950), 181-210, 
and Finney, Biometrika, 34 (1947), 320-34. 


Gray, Percy G., THE MEMORY FACTOR IN SOCIAL SURVEYS, Vol. 50, No. 270
(June 1955), 344-63. 

Robert H. Hoskins has pointed out an error in Table 1 on page 346. The 
average number of consultations per person in the period December and 
January should read 0.503, not 0.533. 


Kempthorne, Oscar, THE RANDOMIZATION THEORY OF EXPERIMENTAL IN-
FERENCE, Vol. 50, No. 271 (September 1955), 946-67.
Julian Stanley has pointed out the following errors: 
p. 950, fourth line from bottom: Delete the second “and”; 
p. 952, second line from bottom: Add “s” to “treatment” and replace “N”
by “n”; 
p. 954, eighth line: Delete semicolon before “is either”; p. 962, fourth line 
from bottom: Delete comma after “U.” 


Kitagawa, Evelyn M., COMPONENTS OF A DIFFERENCE BETWEEN TWO RATES,
Vol. 50, No. 272 (December 1955), 1168-94. 

Several readers have noticed that on page 1171 in the last display after the 
first pair of summation signs there should be a “t_{ij}.”


Moore, P. G., THE PROPERTIES OF THE MEAN SQUARE SUCCESSIVE DIFFER-
ENCE IN SAMPLES FROM VARIOUS POPULATIONS, Vol. 50, No. 270 (June 1955),
434-56.
p. 437, (2n—1)y2 should read 2(n—1) ys, 
2(8n15)us should read 2(8n —15) ys; 


2 , 2 

. te {2m - + + should read —nef_ént see 
n3(n—1)8 n*(n—1)* 

Noether, Gottfried E., USE OF THE RANGE INSTEAD OF THE STANDARD DEVIA-
TION, Vol. 50, No. 272 (December 1955), 1040-55.

p. 1044: Replace lines 5 and 6 by “This value of G;,' is exactly equal to the 
one-sided .05-significance point given in Table 1 [3], which emphasizes 
the need for more accurate tables than are now available.” 

p. 1047: Third line under table should read .039 instead of .029. 


p. 438 


Raff, Morton S., ON APPROXIMATING THE POINT BINOMIAL, Vol. 51, No. 274
(June 1956), 293-303. 
p. 299: The right-hand member of equation (10) should read φ(−y/3√2)
instead of φ(y/3√2).


Rider, Paul R., THE DISTRIBUTION OF THE PRODUCT OF MAXIMUM VALUES IN
SAMPLES FROM A RECTANGULAR DISTRIBUTION, Vol. 50, No. 272 (December
1955), 1142-43.

p. 1143: In the first line of the last paragraph, “origin” should be replaced 
by “mean.” 








652 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1956 


Weiler, H., THE USE OF RUNS TO CONTROL THE MEAN IN QUALITY CONTROL,
Vol. 48, No. 264 (December 1953), 816-25.
Carl Barker has called attention to the following: 
p. 818: Line seven under “Problem” should read “p=0.001,” not “0.01.”
p. 819: In Table I, values of B, when λ=2 should read 1.85, not 1.05 and
when λ=5 should read 0.62, not 0.26.


Wilk, M. B., and Kempthorne, O., FIXED, MIXED, AND RANDOM MODELS,
Vol. 50, No. 272 (December 1955), 1144-67. 
Julian Stanley and William I. Martin have pointed out the following errors: 
p. 1149: In last set of equations (ab_{ij}) should be (ab)_{ij};
p. 1150, line eight, first line of first random variable: 1 is omitted after =; 
p. 1153, Table 3: b/A should be a/A; 
p. 1154, Table 4: First E(MS) should have + instead of = after o,?; 


A 
coefficient of third term should be r [; 


p. 1159, Table 5. E(MS) for Anneals: insert a “+” between ¢,? and the fol- 
lowing parenthesis; second term of E(MS) for Coils should have ¢,? added. 


Willcox, Walter F., METHODS OF APPORTIONING SEATS IN THE HOUSE OF
REPRESENTATIVES, Vol. 49, No. 268 (December 1954), 685-95.

Robert Hoskins has pointed out that on page 691 the District Populations 
for California were transposed and should have been 341 “Before transfer” 
and 353 “After transfer.” Thus Disparity was 11.7 per cent “Before transfer” 
and 11.0 per cent “After transfer.” 


Yeracaris, Constantine A., DIFFERENTIAL MORTALITY, GENERAL AND CAUSE-
SPECIFIC IN BUFFALO, 1939-41, Vol. 50, No. 272 (December 1955), 1235-47.
pp. 1237, 1238, and 1240: Tables 1, 2, and 4 present indexes of mortality per 
100 (recorded) deaths and not per 1,000. 
p. 1242, line 17: Transpose “females” and “males” to read: “is greater for
males than for females (Table 7).”








STATISTICAL ABSTRACTS 


All communications concerning this section should be addressed to the Ab- 
stracts Editor, Professor W. L. Smith, Department of Statistics, University of 
North Carolina, Chapel Hill, North Carolina. 


Anis, A. A., “On the moments of the maximum of 
partial sums of a finite number of independent 
normal variates,” Biometrika, 43 (1956), 79-84. 


Let X_1, ..., X_n be a random sample from a
normal population of zero mean and unit stand-
ard deviation and let U_n be the largest of the
partial sums X_1, X_1+X_2, X_1+X_2+X_3, ...,
X_1+ ... +X_n. The mean and standard de-
viation of U_n have been obtained in earlier pa-
pers; in the present paper recurrence formulas
are obtained for finding the higher moments. A
table of the first four moments of U_n for
n=1, ..., 15 is given; also the limiting values,
as n becomes large, of γ_1 and γ_2 are calculated.
D. R. Cox, University of North Carolina. 
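
The quantity described in this abstract can be explored numerically. The sketch below is not the recurrence method of the paper; it is a plain Monte Carlo estimate of the first four raw moments of the largest partial sum, with function names, sample size, and seed chosen here purely for illustration. The closed form used as a cross-check, E(U_n) = sum over k = 1, ..., n-1 of 1/sqrt(2 pi k), follows from writing U_n as X_1 plus the positive part of the maximum of the later partial sums.

```python
import math
import random


def simulate_max_partial_sum_moments(n, reps=20000, seed=0):
    """Monte Carlo estimate of the first four raw moments of U_n.

    U_n is the largest of the partial sums X_1, X_1+X_2, ..., X_1+...+X_n
    of independent standard normal variates. Illustrative sketch only.
    """
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0, 0.0]
    for _ in range(reps):
        running_sum = 0.0
        largest = -math.inf
        for _ in range(n):
            running_sum += rng.gauss(0.0, 1.0)
            largest = max(largest, running_sum)
        for k in range(4):
            totals[k] += largest ** (k + 1)
    return [total / reps for total in totals]


def exact_mean(n):
    """E(U_n) = sum_{k=1}^{n-1} 1 / sqrt(2*pi*k); zero when n = 1."""
    return sum(1.0 / math.sqrt(2.0 * math.pi * k) for k in range(1, n))


if __name__ == "__main__":
    for n in (1, 2, 5, 15):
        estimate = simulate_max_partial_sum_moments(n)
        print(n, [round(m, 3) for m in estimate], round(exact_mean(n), 3))
```

The simulated first moment should agree with the closed form to within sampling error; the higher moments have no such check here and are only as accurate as the number of replications allows.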


Bailey, N. T. J., “On estimating the latent and 
infectious periods of measles. I. Families with 
two susceptibles only,” Biometrika, 43 (1956), 
15-22, 


It is assumed that after infection with measles
there is a latent period approximately normally 
distributed followed by a period of infectiousness 
of constant duration and then by the appearance 
of symptoms. Families of two susceptibles are 
considered and it is supposed that infection of 
the second susceptible during the infectious pe- 
riod is governed by a Poisson process. 

The parameters in this setup are estimated by 
maximum likelihood from observations of the 
time interval between two cases of measles in 
families with two susceptibles. The distribution 
of this interval is formed partly from cases infect- 
ed simultaneously from a common source and 
partly from infection of the second susceptible 
in the family by the first. It is assumed in most of 
the paper that observations can be divided into 
these two types with negligible chance of error, 
the complications arising when this classification 
cannot be made being considered briefly. It seems 
to be assumed that independent infection of the 
two individuals from different sources at differ- 
ent times rarely occurs. D. R. Cox, University of 
North Carolina.


Bartholomew, D. J., “A sequential test of ran- 
domness for events occurring in time or space,” 
Biometrika, 43 (1956), 64-78. 


Events occur haphazardly in time (or space) 
and the times at which they occur become avail- 
able in sequence. A sequential test is considered 
for the hypothesis that the events occur com- 
pletely randomly against the alternative that 
there is a particular type of smooth trend in the 
probability rate of occurrence. 

Tables and numerical examples are given. 
D. R. Cox, University of North Carolina. 


Bellerby, J. R., “Agricultural Income,” Journal 
of the Royal Statistical Society (A), 118 (1955), 
336-44. 


This article presents a detailed breakdown of 
farm income and outgo over many years past and 
a comparison of farmers’ net income to that of in- 
dustrial workers. It is of agricultural interest, but 
it presents nothing new to the field of mathe- 
matical statistics. R. H. Rirrensurau, Virginia 
Polytechnic Institute.


Bose, R. C., “Paired comparison designs for test- 
ing concordance between judges,” Biometrika, 43 
(1956), 113-21. 


A number of objects are available for compari- 
son using a number of judges. Each judge is given 
certain pairs of objects and asked to say which 
member of each pair is preferred. It is assumed 
that the number of objects is so large that it is 
impracticable for each judge to examine every 
possible pair of objects. 

Some designs with a high degree of symmetry 
are defined; they are somewhat analogous to bal- 
anced incomplete block designs. Properties of the 
designs are investigated and tables are given for 
certain cases. There is no discussion of methods 
of analysis. D. R. Cox, University of North
Carolina.


Broadbent, S. R., “Examination of a quantum 
hypothesis based on a single set of data,” Bio- 
metrika, 43 (1956), 32-44. 


Some observations y_1, y_2, ..., y_n are availa-
ble and it is suspected that they may have origi-
nated from a quantum model y_i = β + 2 r_i δ + ε_i,
where r_i is zero or an integer and 2δ is the con-
stant quantum. The ε_i are errors of observation.

If a possible value of δ, d say, is suggested a
priori, agreement with it may be tested by the
least squares statistic


s²/d² = Σ(y_i − 2 r_i d)²/(n d²),


where r_i is the integer that minimizes |y_i − 2 r_i d|
[Broadbent, Biometrika, 42 (1955), 45]. The same
statistic may be used when d has been derived
from inspection of the data or by minimizing
s²/d² for variations of d over a limited range, but
the derivation of a formal significance test is then 
difficult. Numerical examples and the results of 
sampling experiments are reported. D. R. Cox, 
University of North Carolina. 
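
The least squares statistic quoted in this abstract is simple to compute once d is fixed, since the minimizing integer r_i is just the nearest integer to y_i/(2d). The following is a minimal sketch under that reading; the function name and the sample values are illustrative assumptions, and no significance test is attempted.

```python
import math


def quantum_statistic(ys, d):
    """Return s^2/d^2 = sum (y_i - 2 r_i d)^2 / (n d^2).

    For each observation, r_i is taken as the integer nearest to y_i / (2d),
    which minimizes |y_i - 2 r_i d|. Sketch only; d must be positive.
    """
    if d <= 0 or not ys:
        raise ValueError("need a positive d and at least one observation")
    total = 0.0
    for y in ys:
        r = math.floor(y / (2.0 * d) + 0.5)  # nearest multiple of the quantum 2d
        total += (y - 2.0 * r * d) ** 2
    return total / (len(ys) * d * d)


if __name__ == "__main__":
    # Hypothetical observations: exact multiples of 2d = 1.0, then half-way cases.
    print(quantum_statistic([1.0, 2.0, 3.0], 0.5))
    print(quantum_statistic([0.5, 1.5], 0.5))
```

The statistic lies between 0 (every observation an exact multiple of 2d) and 1 (every observation half-way between multiples); residuals spread uniformly over (−d, d) would give a value near 1/3.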


Burr, I. W., “Calculation of exact sampling dis- 
tribution of ranges from a discrete population,” 
Annals of Mathematical Statistics, 26 (1955),
530-2. 


This note gives a computational technique for 
calculating the exact sampling distribution of 
ranges for discrete universes having a finite range 
and approximating those for populations with an 
infinite range. An example, studying the effect on 
ranges of non-normality (skewness) in the popu-
lation, is given. W. J. Hall, Communicable Dis-
ease Center, USPHS.
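
For a population on finitely many equally spaced values, the exact distribution of the sample range can be obtained by a short inclusion-exclusion argument. The sketch below uses that standard argument; it is not claimed to be the computational scheme of the note abstracted above, and the function name and the die example are assumptions made for illustration.

```python
from fractions import Fraction


def range_distribution(pmf, n):
    """Exact distribution of the range of n independent draws.

    pmf[k] is the probability of the value k (k = 0, 1, ..., m-1).
    Returns a list whose entry r is P(range = r). Uses
    P(min = a, max = b) = G(a,b) - G(a+1,b) - G(a,b-1) + G(a+1,b-1),
    where G(a,b) = P(all n draws lie in a..b).
    """
    m = len(pmf)
    prefix = [Fraction(0)]  # prefix[k] = P(X < k)
    for p in pmf:
        prefix.append(prefix[-1] + Fraction(p))

    def all_within(a, b):
        if a > b:
            return Fraction(0)
        return (prefix[b + 1] - prefix[a]) ** n

    dist = [Fraction(0)] * m
    for a in range(m):
        for b in range(a, m):
            dist[b - a] += (all_within(a, b) - all_within(a + 1, b)
                            - all_within(a, b - 1) + all_within(a + 1, b - 1))
    return dist


if __name__ == "__main__":
    die = [Fraction(1, 6)] * 6  # a fair die, values shifted to 0..5
    print(range_distribution(die, 2))
```

Exact rational arithmetic is used so that the probabilities sum to one without rounding error; a skewed pmf can be substituted to see the effect of non-normality on the range.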




Darwin, J. H., “The behaviour of an estimator 
for a simple birth and death process,” Bio- 
metrika, 43 (1956), 23-31. 


A population is known to have N_0 individuals
at time t=0 and its size is observed at times τ,
2τ, ..., kτ. The growth is assumed to be gov-
erned by the simple birth-death process, i.e. an
individual in the population is assumed to have
a chance λ dt of giving birth to a new individ-
ual and a chance μ dt of dying in any period
(t, t+dt).

The estimation of the growth rate e^{(λ−μ)τ} is
considered, the estimate based on the population
counts being compared with the maximum like-
lihood estimate based on observations of the in-
stants of occurrence of births and deaths. It is
assumed that N_0 is large. D. R. Cox, University
of North Carolina. 


David, H. A., “On the application to statistics of 
an elementary theorem in probability,” Bio- 
metrika, 43 (1956), 85-91. 


Applications are given to the calculation of 
(i) the distribution of the extreme deviate from 
the sample mean, (ii) the power of the extreme 
deviate test for the rejection of an outlying ob- 
servation, (iii) the distribution of the maximum 
F-ratio. Interesting numerical examples are 
given. D. R. Cox, University of North Carolina. 


Finney, D. J., “The statistician and the planning 
of field experiments,” Journal of the Royal Sta- 
tistical Society, (A), 119 (1956), 1-27. 


This paper was read before the Royal Statisti- 
cal Society on November 16, 1955. The author 
examines and discusses from his point of view the 
responsibilities of statisticians relative to the en- 
tire structure and conduct of research programs. 
An ideal pattern of statistical service to research 
is presented. The general aim of this paper is “to 
provoke thought, discussion, and even disagree- 
ment.” Some interesting discussion follows Fin- 
ney’s paper. H. A. Sriiu, Virginia Polytechnic 
Institute. 


Grad, A., and Solomon, H., “Distribution of
quadratic forms and some applications,” An-
nals of Mathematical Statistics, 26 (1955), 464-77.


The authors discuss the distribution of a posi-
tive definite quadratic form in normal variables
(Q_k = Σ_{i=1}^k a_i x_i², Σ a_i = 1, a_i > 0, and the x_i’s are in-
dependent normal zero-one variables) and three
approximations to the distribution. They pre-
sent tables of the exact distribution for k = 2 and
3 and compare some of the values obtained by
the approximate methods. Applications in two
and three dimensions are discussed, particularly
concerning hit probabilities in military opera-
tions. W. J. Hall, Communicable Disease Cen-
ter, USPHS.


Grebenik, E., “Population and vital statistics,” 
Journal of the Royal Statistical Society (A), 118 
(1955), 452-62. 


This paper is a rather detailed historical note 
dealing with the evolution of population and 
vital statistics recorded in England and Wales. 

The author describes the officials responsible 
for gathering the data and gives a summary of 




the laws enacted to assist in this work. Among 
the areas given special attention are: population 
census, vital registration, mortality statistics, 
and marriage and divorce statistics. 

A brief description of some of the problems in- 
volved in making estimates of certain su 
of the population is given and some proposals are 
made to reduce inaccuracies which exist at pres-
ent. Richard A. Stewart, Virginia Polytechnic
Institute.

Haldane, J. B. S., and Smith, S. M., “The sam- 
pling distribution of a maximum likelihood esti- 
mate,” Biometrika, 43 (1956), 96-103. 

Let the data consist of the numbers of obser- 
vations falling in a series of groups, and let the 
probabilities associated with these groups be 
given with one unknown parameter. The first 
four cumulants of the maximum likelihood esti- 
mate of this parameter are obtained as asymp- 
totic series for large samples. D. R. Cox, Univer- 
sity of North Carolina. 


Hodges, J. L., Jr., “A bivariate sign test,” An- 
nals of Mathematical Statistics, 26 (1955), 523-7. 


This paper proposes a bivariate analog of the 
two-sided sign test. Considering a series of in- 
dividuals on which measurements are made 
under two conditions, the null hypothesis to be 
tested is that the measurements under the two 
conditions are independent and identically dis- 
tributed, but no assumptions other than inde- 
pendence are necessary concerning successive 
individuals in the series. In the univariate test, 
one measurement is made under each of the two 
conditions, and the alternative hypothesis is that 
the second measurements are generally shifted 
with respect to the first measurements. In the bi- 
variate test, two quantities are measured under 
each condition, and the alternative of interest is 
that under the second condition the bivariate 
distribution of the two measurements has been 
shifted relative to the first pair, in generally the 
same direction for all individuals, but the direc- 
tion of this possible shift is unknown. (If the di- 
rection were known, a univariate sign test could 
be used.) The idea of the proposed test is: it is 
judged that a shift has occurred if there is some 
direction in which most of the pairs of measure- 
ments have shifted, and that no shift has oc- 
curred if the shifts are in various directions with 
no heavy concentration. 
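
One natural way to make the idea just described concrete is to count, over all directions, the largest number of difference vectors (second measurement pair minus first) that fall in a single open half-plane through the origin. The sketch below computes that count by brute force. It is an illustration of the idea only: the function name is hypothetical, the data are assumed to contain no zero vectors and no two collinear vectors, and the null distribution needed for an actual test is not reproduced.

```python
def max_in_open_half_plane(diffs):
    """Largest number of difference vectors lying in one open half-plane.

    diffs is a list of (dx, dy) pairs. Assumes no vector is zero and no two
    vectors are collinear. A best half-plane can always be rotated until its
    boundary passes through some vector d_j, so it suffices to try the two
    half-planes bounded by the line through each d_j.
    """
    best = 0
    for (xj, yj) in diffs:
        for sign in (1.0, -1.0):
            ux, uy = -sign * yj, sign * xj  # normal to the line through d_j
            # Vectors strictly on the positive side, plus d_j itself, which an
            # arbitrarily small tilt of the boundary brings inside.
            count = 1 + sum(1 for (x, y) in diffs if x * ux + y * uy > 0.0)
            best = max(best, count)
    return best


if __name__ == "__main__":
    # Hypothetical shifts: mostly toward the upper right.
    shifts = [(1.0, 0.1), (1.0, 0.2), (0.5, 1.0), (-0.3, 0.9), (-1.0, -0.4)]
    print(max_in_open_half_plane(shifts), "of", len(shifts))
```

A count close to the number of individuals suggests a common direction of shift; a count near one half suggests shifts in various directions with no heavy concentration.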

“To illustrate, suppose we measure blood 
pressure and blood sugar before and after treat- 
ment with a new drug on a number of individ- 
uals. We wish to know whether the drug influ- 
ences these quantities, but have no preconceived 
notion concerning the direction or relative amount 
of the influence on either quantity, should it ex- 
ist. The joint distribution of the quantities has 
an unknown form, and is presumably different 
in different individuals. The quantities are pre- 
sumably dependent, but in an unknown way.” 

The distribution theory for the proposed test 
statistic is worked out only under certain re- 
strictions which, however, allow performance of 
tests at the 5% significance level if the number 
of individuals in the series, n, is less than 72 and 
tests at the 1% level if n is less than 102. Tables 
are provided for n ≤ 30. Multivariate analogs are
briefly mentioned. W. J. Hall, Communicable
Disease Center, USPHS.


Homma, T., “On a certain queuing process,”
Reports of Statistical Application Research, Japa-
nese Union of Scientists and Engineers, 4 (1955),
14-32.


Let z(t) be the size of a classical single server
queue in which the customers arrive “at random”
(exponential holding time with parameter λ). In
addition suppose that if the size is j an arriving
potential customer joins the queue with proba-
bility p_j, 1 = p_0 ≥ p_1 ≥ p_2 ≥ ... ≥ p = lim_{j→∞} p_j. If
the times of departures of the served customers
are denoted by τ_1 < τ_2 < τ_3 < ... the author claims
that z(τ_j + 0), j = 1, 2, ..., is an irreducible Mar-
kov chain. Questions as to whether the chain is
ergodic, recurrent, transient, etc. are studied. T.
Kawata proved that if the service time is expon-
ential with parameter b, and λ/b = ρ, the chain
is ergodic if and only if p < 1/ρ. The present au-
thor proves the same result for a general service
time distribution with finite nonzero mean. In
addition he proves the chain is recurrent and null
if p_j = 1/ρ for sufficiently large j, and is transient
if p > 1/ρ. There is a gap when p = 1/ρ which the
author notes, but does not resolve. The paper
closes by studying the waiting time distribution
when one has exponential service time and p < 1/ρ
(ergodic case). D. A. Darling, University of
Chicago.


Karpinos, Bernard D., and Grossman, Harold A., 
“Prevalence of left-handedness among selective 
service registrants,” Human Biology, 25 (1953),
36-49.


The replies of 12,159 registrants have been 
classified according to left- or right-handedness,
qualified or disqualified (for induction into army 
service), and geographical area (six army areas). 
Simple chi-square tests have been applied indi- 
cating some relationship between the prevalence 
of left-handedness and qualification grouping 
(the prevalence being higher in the disqualified 
group) but little relationship between the preva- 
lence of left-handedness and geographical area. 
D. B. Duncan, University of Florida. 


Kendall, M. G., and Lawley, D. N., “The prin-
ciples of factor analysis,” Journal of the Royal
Statistical Society (A), 119 (1956), 83-4.


To some extent factor analysis formally resem-
bles principal component analysis. This has re-
sulted in some confusion of the two methods even
though their aims are entirely different. To dispel
this confusion the authors briefly explain com-
ponent analysis as a method in which “a linear
and orthogonal transformation is applied to the
p variates X_1, X_2, ..., X_p to produce a new set
of uncorrelated variates y_1, y_2, ..., y_p.” The y’s
are chosen so that y_1 has maximum variance, y_2
has maximum variance subject to being uncorre-
lated with y_1, and so on. In contrast to compo-
nent analysis in which no assumptions need be
made on the X’s the authors note that factor
analysis requires the basic assumption that
X_i = Σ_{r=1}^m l_{ir} f_r + e_i (i = 1, 2, ..., p), where f_r is
the rth factor and e_i is a residual representing
sources of variation affecting only the variate X_i.
The e_i are assumed independent of one another
and of the f_r. The latter are assumed orthogonal.

A brief discussion of this model indicates the hy-
pothesis to be tested and some of the difficulties
encountered in estimating the l’s. R. N. Pender-
grass, Virginia Polytechnic Institute.


Liddell, F. D. K., “Colliery statistics,” Journal 
of the Royal Statistical Society (A), 118 (1955),
405-16.


In this paper the author has tried to explain 
some of the difficulties involved in the interpre- 
tation of coal statistics and how they are being 
overcome. The most difficult problem is to find a 
reliable measure for general use of a colliery’s 
performance. However, collieries are so diverse 
in their nature that any study of their perform-
ance is severely handicapped. The reasons for
such changes in performance may be legion and
widely different action may be needed for im- 
provement. The author explains, however, that 
with the raw material now available in the form 
of the “Colliery Profile” and with the tools they 
are learning to use, the reasons for the very wide 
variations in performance can be analyzed and 
statistical problems should become more tracta-
ble. Joseph F. Guissier III, Virginia Polytech-
nic Institute.


Oertel, A. C., and Cornish, E. A., “The frequency
distribution of the spectrographic (D.C. arc) er-
ror,” Australian Journal of Applied Science, 4
(1953), 489-507.


“A sample of 1161 errors, derived from 259 
sets of triplicate and 64 sets of sextuplicate meas- 
urements of line intensity, has been used to de- 
termine the frequency distribution of the spec- 
trographic error which occurs in the D.C. arc
excitation of samples of soil and plant ash. It 
has been found that the error is, in the greater 
part at least, a proportional one, and when ex- 
pressed on a logarithmic scale, its frequency dis- 
tribution, so far as can be ascertained from these 
data, is normal. It follows that measures and 
tests of significance based upon the hypothesis of 
normality are applicable in spectrographic (D.C. 
arc) investigations. Additional data are being ac- 
cumulated to provide the means for making more 
comprehensive tests of the nature of the fre- 
quency distribution.” Authors’ summary. D. B. 
Duncan, University of Florida. 


Pillai, K. C. S., “On the distribution of the larg- 
est or the smallest root of a matrix in multivari- 
ate analysis,” Biometrika, 43 (1956), 122-7. 


An approximation is given for the upper per- 
centage points of the distribution of the largest 
characteristic value of sample covariance ma- 
trices arising in multivariate analysis. Numerical 
examples are given for the bivariate case. D. R. 
Cox, University of North Carolina.


Prais, S. J., “Measuring social mobility,” Jour-
nal of the Royal Statistical Society (A), 118 (1955),
56-66.


From an earlier work by D. V. Glass, a table of 
statistics is set up using occupation as the basis 
for status derived from a series of interviews to 
form a relation between the social status of fa- 
thers and of their sons. This statistic is arranged 
in a transition matrix where the coefficients of the 







matrix are regarded as giving the probability of 
a family’s transition from one social status to an- 
other. 

From the transition matrix, using an approach 
from the Theory of Markov Chains, the author 
develops an equilibrium distribution for the so- 
cial classes over an infinite number of genera- 
tions. The average time which a family spends in 
one social class is tabulated and the variation of 
mobility and immobility within time are calcu- 
lated. The author offers some extension of his 
work to refine the transition matrix and makes 
suggestions as to further work which might be 
done. 

The derivations of the mean and variance of
the times spent in a social class are contained in
Appendix B, and a correction for shifts in the oc-
cupational structure is in Appendix A. J. F.
Graninger, Jr., Virginia Polytechnic Institute.
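
The two calculations mentioned in this abstract, an equilibrium distribution and the time spent in a class, can be sketched for any transition matrix using standard Markov chain results. In the sketch below the two-class matrix is invented for illustration and is not taken from the article; the equilibrium is found by repeated multiplication, and the number of consecutive generations spent in a class with retention probability p is treated as geometric, with mean 1/(1−p) and variance p/(1−p)².

```python
def stationary_distribution(P, iterations=1000):
    """Approximate the equilibrium distribution of a transition matrix P.

    P[i][j] is the probability of moving from class i to class j in one
    generation. Starts from a uniform distribution and applies P repeatedly.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def sojourn_mean_and_variance(p_stay):
    """Mean and variance of the number of consecutive generations in a class
    whose one-generation retention probability is p_stay (geometric law)."""
    mean = 1.0 / (1.0 - p_stay)
    variance = p_stay / (1.0 - p_stay) ** 2
    return mean, variance


if __name__ == "__main__":
    # Hypothetical two-class matrix (rows: father's class, columns: son's class).
    P = [[0.9, 0.1],
         [0.5, 0.5]]
    print(stationary_distribution(P))
    print([sojourn_mean_and_variance(P[i][i]) for i in range(len(P))])
```

Repeated multiplication converges here because the illustrative chain is irreducible and aperiodic; for a matrix without those properties a different method would be needed.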


Putter, Joseph, “The treatment of ties in some 
nonparametric tests,” Annals of Mathematical 
Statistics, 26 (1955), 368-86. 


The author compares “randomized” and “non-
randomized” methods of treating ties in rank or-
der tests, and shows that randomization reduces
both the exact power and the asymptotic effi-
ciency of the one-sided sign test and reduces the
asymptotic efficiency of the Wilcoxon two-sam-
ple test. W. J. Hall, Communicable Disease Cen-
ter, USPHS.


Schaeffer, M. S., and Levitt, E. E., “Concerning 
Kendall’s tau, a nonparametric correlation coef- 
ficient,” Psychological Bulletin, 53 (1956), 338- 
46. 

In this expository article the writers define tau 
and show how to adjust for tied ranks. They dis- 
cuss tests of significance, corrections for continu- 
ity, and confidence limits and provide a computa- 
tional example illustrating these. Finally, they 
comment on partial rank correlation and the 
relationship between tau and product-moment r.
Julian C. Stanley, University of Wisconsin.
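
As a concrete companion to this abstract, the sketch below computes Kendall's tau with the widely used tie adjustment often called tau-b, in which the denominator is reduced by the number of pairs tied on each variable. Whether this is exactly the adjustment presented in the article is an assumption; the function name and the data are illustrative.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau with the tau-b adjustment for ties.

    tau_b = (C - D) / sqrt((n0 - n1) * (n0 - n2)), where C and D count
    concordant and discordant pairs, n0 = n(n-1)/2, and n1, n2 count the
    pairs tied on x and on y respectively.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = tied_x = tied_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0:
                tied_x += 1
            if dy == 0:
                tied_y += 1
            if dx != 0 and dy != 0:
                if dx * dy > 0:
                    concordant += 1
                else:
                    discordant += 1
    n0 = n * (n - 1) // 2
    denominator = math.sqrt((n0 - tied_x) * (n0 - tied_y))
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


if __name__ == "__main__":
    # Hypothetical rankings with one tie in the first variable.
    print(kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 4]))
```

With no ties the denominator reduces to n(n−1)/2 and the result is the ordinary tau.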


Utting, J. E. G., “National income and related 
statistics,” Journal of the Royal Statistical Society 
(A), 118 (1955), 434-51. 


Unlike most official statistics, official national
income figures are estimates. There exists no 
unique definition of national income. The British 
theorists use the concept of the output of goods 
and services becoming available to the nation 
during a given period. The money payment of 
the national income is the sum of payments to 
the factors of production in the period concerned. 
The national income can be broken up into such 
categories as “Output and Expenditures,” “In- 
dustrial Input and Output,” etc., and statistical 
analyses made of these categories. Catharine
Howard, Virginia Polytechnic Institute.


Williams, E. J., “Simplified calculations for the 
estimation of gene frequencies for the rhesus fac- 
tor and an application to partially classified 




data,” The Medical Journal of Australia, 1 (1949) 
No. 3. 


This is a two-page note describing a modifica-
tion of the general method for estimating gene
frequencies given by R. A. Fisher, “The fitting
of gene frequencies to data on rhesus reactions,”
The Annals of Eugenics 13 (1946), 150. The modi- 
fication effects a reduction in the computing work 
involved. A clear and detailed example is given 
in which the method is applied to rhesus reac- 
tions of 558 New Caledonian aborigines, some of 
the reactions being only partially classified. 
D. B. Duncan, University of Florida. 


Wishart, J., “χ² probabilities for large numbers
of degrees of freedom,” Biometrika, 43 (1956), 
92-5. 

A number of comparisons are given which are 
useful for computing very accurate χ² probabili-
ties when the degrees of freedom are large. D. R. 
Cox, University of North Carolina. 


Wold, Herman, “Causal inference from observa-
tional data,” Journal of the Royal Statistical So-
ciety (A), 119 (1956), 48-50.

In the handling of data both descriptive and 
explanatory problems arise. Both description and 
explanation may be treated on the basis of ex- 
perimental or observational data. This paper ex- 
amines the differences in explanatory procedures 
when dealing with experimental and observa- 
tional data. “It is argued that the theoretical 
model on which the statistical analysis is based 
can be specified with less exactness and detail 
when passing from experimental to observational
data, and that this leads to a partial re-orienta- 
tion of the statistical techniques. The shifts of 
emphasis are examined with regard to specifica- 
tion of hypotheses, estimation procedures, and 
hypothesis testing. The conclusion reached is 
that with observational data the statistical anal- 
ysis becomes more dependent on (a) co-ordina- 
tion with subject-matter theory, (b) large-sample 
methods, and (c) checks and tests against other 
evidence. This conclusion is illustrated with ref-
erence to four types of approach.” R. E. Wal-
pole, Virginia Polytechnic Institute.


Report of a Working Group, “Statistical assess- 
ment of the need for clinical services,” Journal of 
the Royal Statistical Society (A), 118(4), (1955), 
427-33. 

The report of a working group of the Commit- 
tee of the Study Circle of Medical Statistics is 
presented. In this report, the group gave existing 
conditions for medical services in Britain, indi-
cating the hospital facilities available, the use
made of these facilities, and the number of per- 
sons on waiting lists for each type of service. The 
pressing problem was to estimate the need for 
hospital facilities in the future. They suggested 
methods for assessing the need independently of 
existing services and considered existing sources 
for information on the problem. Jean Larry, 
Virginia Polytechnic Institute. 





BOOK REVIEWS 


Faster, Faster: A Simple Description of a Giant Electronic Calculator and the Problems 
It Solves. W. J. Eckert and Rebecca Jones. New York: McGraw-Hill Book Company, 
Inc., 1955. Pp. viii, 160. $3.75. 


See review article by Thornton Fry, pp. 565-75 in this issue. 


The Foundations of Statistics. Leonard J. Savage. New York: John Wiley and Sons, 1954. 
Pp. xv, 294. $6.00. 


F. J. ANSCOMBE, University of Cambridge*


DESPITE the vastness of the literature on the theory of statistics and its applica-
tions, there is little to which a professional statistician can turn with hope of
enlightenment on the fundamental ideas of his subject. If one were to compile a list 
of possible sources, one would begin, no doubt, with the works of R.A. Fisher, includ- 
ing his new Statistical Methods and Scientific Inference, and one might continue with 
the thoughtful criticism of Fisher’s ideas in H. Jeffreys’ Theory of Probability. High 
up on the list after that would surely come this book by Savage. 

The first section of the book (Chaps. 1—5) establishes a theory of consistent 
rational decisions made by a person in the face of uncertainty; the notions of personal 
(subjective) probability and utility are defined and explored. The ground covered 
is roughly that of the middle portion of F. P. Ramsey’s essay on Truth and Prob- 
ability (in The Foundations of Mathematics, London, 1931), but the manner is very 
different. Ramsey’s essay is easy to read superficially and gives only a sketch of a 
formal theory; he writes: “I have not worked out the mathematical logic of this in 
detail, because this would, I think, be rather like working out to seven places of 
decimals a result only valid to two. My logic cannot be regarded as giving more than 
the sort of way it might work.” Savage has gone to all of the seven places, and 
moreover uses an abbreviated mathematical notation to which one becomes ac- 
customed only with difficulty. (He suggests in the Preface that the reader should sit 
bolt upright on a hard chair at a desk, with pencil and paper. Those of us who prefer 
lying in deck chairs may expect trouble.) Savage’s treatment is based, not primarily 
on Ramsey’s, but on the work of B. de Finetti in subjective probability and on the 
von Neumann-Morgenstern theory of utility. The upshot is to prove that any person 
who can express preferences between available acts so as to satisfy certain postulates 
implying what seems to be a mild and reasonable kind of consistency and rational 
behavior—any such person will act as though he attaches numerical probabilities to 
relevant propositions about the unknown state of the world, and numerical utilities to 
the value of the possible consequences of the acts, and then choose an act that 
maximizes the expected utility. Savage’s postulates, expressing the consistency and 
rationality of behavior of the person, are remarkably weak, and are stated with great 
clarity. This section seems to the reviewer to be the most completely successful 
and stimulating part of the book. It offers a serious challenge to anyone who asserts 
that subjective probabilities are meaningless. The very features of the treatment that 
make it rather hard to read will no doubt commend it widely, now that Ramsey’s 
type of lucidity is out of fashion.
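
As a rough illustration of the conclusion described above (not taken from Savage's text; the acts, states, probabilities, and utilities are all hypothetical), a person who attaches numerical probabilities to states of the world and numerical utilities to consequences would choose the act with the largest expected utility. A minimal sketch:

```python
# Hypothetical personal probabilities over the unknown state of the world.
probabilities = {"rain": 0.3, "dry": 0.7}

# Hypothetical utilities of the consequence of each act in each state.
utilities = {
    "carry umbrella": {"rain": 5.0, "dry": 3.0},
    "leave umbrella": {"rain": -10.0, "dry": 6.0},
}


def expected_utility(act):
    """Probability-weighted average utility of one act."""
    return sum(probabilities[s] * utilities[act][s] for s in probabilities)


def best_act():
    """Act that maximizes expected utility."""
    return max(utilities, key=expected_utility)
```

The sketch runs in the opposite direction from the theorem: it starts from the numbers and derives the choice, whereas the result reviewed here starts from consistent preferences and shows that such numbers must exist.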

The next two chapters, completing the first half of the book, discuss the notion 
of a decision depending on the outcome of an act of observation. They form a bridge 
to the second half of the book, in which some current statistical theory is examined 
in the light of the preceding ideas. The author considers that an essential characteris- 
tic of statistical methods is that they refer or lead to decisions made, not by a single 





* Now at Princeton University. 







658 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1956 


person, but by a group—or by a single person who needs to persuade a group. He 
even suggests the “very tentative” definition that statistics proper is just “the art
of dealing with vagueness and with interpersonal difference in decision situations.” 

The author focuses attention on A. Wald’s minimax principle. He gives an account 
in customary objectivistic terms, together with a sketch of relevant games theory, 
and also in parallel develops a subjective interpretation in terms of decisions made 
by a group. This latter interpretation he has previously outlined in a paper in this 
journal, Vol. 46 (1951), pp. 55-67. It is supposed that the members of the group differ 
in their probability judgments, but not in their value judgments (utilities), and it is 
suggested that such a group might consider the minimax principle to be an acceptable 
way of reaching a compromise. So far as the reviewer is aware, this is the only in- 
telligent attempt at justifying the minimax principle made by anyone, and it is a 
most interesting idea. Unfortunately, many real problems of group decision are com- 
plicated by differences in value judgments, but this much is well worth having. (An 
interesting step towards solving the more general problem in which the utilities of 
different persons are not directly comparable has since been taken by R. B. Braith- 
waite, Theory of Games as a Tool for the Moral Philosopher, Cambridge, 1955.) 
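
For readers unfamiliar with the minimax principle, the following is a minimal sketch under simplifying assumptions: the loss table is invented, only pure (non-randomized) acts are considered, and losses are taken to be shared by all members of the group. The worst case over states is also the worst expected loss under any member's probability judgment, which is one way to see why a group that disagrees only about probabilities might accept the rule.

```python
# Hypothetical loss table, loss[act][state]; every number is invented.
loss = {
    "act A": {"state 1": 0.0, "state 2": 8.0},
    "act B": {"state 1": 3.0, "state 2": 4.0},
    "act C": {"state 1": 6.0, "state 2": 1.0},
}


def worst_case_loss(act):
    """Largest loss the act can incur over all states."""
    return max(loss[act].values())


def minimax_act():
    """Pure act whose worst-case loss is smallest."""
    return min(loss, key=worst_case_loss)
```

The full theory also admits randomized mixtures of acts, which this sketch omits.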

The last three chapters of the book discuss point and interval estimation and signifi- 
cance tests, with main emphasis on their relation to the minimax principle. Unlike the 
rest of the book, these chapters do not expound a positive idea but are concerned 
with commenting (at not too deep a level) on the ideas of others. Here, as everywhere 
else, interesting details abound. 

Of any new book it is natural to ask two questions: (1) What is the point of it; 
where does it fit in? (2) How well does it succeed in doing what might reasonably be 
expected of it? In the present case neither question is quite simple to answer. As 
to the first, the book does not really come under the heading of Statistical Method- 
ology, but rather under that of Scientific Reasoning. A sufficiently developed theory 
of scientific reasoning would be the ground from which statistical methods intended 
for practical use could be properly evaluated. Savage has made a fine contribution 
to this study, one which should assist statistical theorists and philosophers for many 
years to come. At first the title of the book may seem a little misleading—until one 
reflects that anyone interested in the foundations of statistics will certainly need to 
read it. 

As for how well the book succeeds in its task, the answer must be “excellently as 
far as it goes.” Savage’s notion of group decisions may be expected to have lasting 
importance, even if the minimax principle does not. Ultimately, however, most criti- 
cal discussion is likely to center on the treatment of one-person decisions. The essen- 
tial character of Savage’s theory is determined by his first postulate, that preferences 
establish a simple ordering among acts, and by the definitions of terms, such as 
“state of the world,” preceding this postulate. Although the author discusses his 
postulates with a care and persuasiveness rare among mathematicians, much more 
discussion is needed of this first postulate. This review is not the place for such dis- 
cussion, but the reviewer would like to sketch a line of thought on the subject. A dis- 
tinction should be drawn between those decision situations where it is reasonable to 
suppose that Savage’s first postulate is (approximately) satisfied and those where 
it is not, and the reasons for the failure in the latter case need to be examined. The
distinction will be found, I think, to correspond to the distinction between two types 
of inference situation encountered in the statistical analysis of data: (1) where it is 
reasonable to postulate a class of “admissible” simple hypotheses (usually labeled by 
the values of one or several parameters), and to assert that this class contains the 
truth; and (2) where what is in question is whether some class of simple hypotheses 







(usually now called a composite null hypothesis) contains the truth, this class not 
being imbedded in a wider class of well-defined “admissible” hypotheses asserted to 
contain the truth. In the first type of situation Bayesian methods are available, and 
the information contained in the observations is summed up completely in the likeli- 
hood function. The likelihood function compares the alternative hypotheses in the 
light of the given observations; comparisons are made in “parameter space,” not in 
“sample space.” In the second type of situation, we are interested to compare the 
observations with what might have been expected if the null hypothesis were true; 
comparisons are made in “sample space,” and tests of goodness of fit result. Thus tests 
of goodness of fit differ essentially from Bayesian decision procedures, and so from 
the whole of current decision theory, which, whether openly Bayesian or not, postu- 
lates a not-too-large class of well-defined admissible hypotheses. 
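
The contrast drawn above can be illustrated with a small numerical sketch. The data (15 successes in 20 trials) and the hypotheses are hypothetical, and a simple tail-area calculation stands in here for a goodness-of-fit test. In the first calculation two admissible hypotheses are compared at the fixed observation; in the second the observation is compared with what else might have been observed under a single null hypothesis.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, k = 20, 15  # hypothetical observation

# (1) Comparison in "parameter space": likelihood of p = 0.7 relative to
#     p = 0.5, both evaluated at the observed k.
likelihood_ratio = binomial_pmf(k, n, 0.7) / binomial_pmf(k, n, 0.5)

# (2) Comparison in "sample space": probability, under the null hypothesis
#     p = 0.5 alone, of a result at least as extreme as the observed k.
tail_probability = sum(binomial_pmf(j, n, 0.5) for j in range(k, n + 1))
```

Note that the second calculation needs no alternative hypothesis at all, which is the point of the distinction.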

To sum up, if there is any serious flaw in Savage’s argument, it is one of excessive 
definiteness at the outset. Savage’s theory of decisions is adequate for the discussion 
of many, but not all, problems in statistical methodology. Questions of goodness of 
fit of a scientific theory to observations are outside its scope. 


Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability. 
Jerzy Neyman, Editor. Vol. III: Contributions to Astronomy and Physics; Vol. IV:
Contributions to Biology and Problems of Health; Vol. V: Contributions to Econometrics,
Industrial Research, and Psychometry. Berkeley and Los Angeles: University of Cali-
fornia Press, 1956. Pp. viii, 252; ix, 179; ix, 184. Vol. III, $6.25; Vols. IV and V, $5.75 each.


“THE third Berkeley Symposium on Mathematical Statistics and Probability was
held in two parts, one from December 26 to 31, 1954, emphasizing applications,
and the other, in July and August, 1955, emphasizing theory. . . . All papers pub-
lished in these Proceedings were written at the invitation of the Statistical Labora-
tory. . . . However, in no case was there any pressure on the authors to introduce any
material change into their work [before publication].”

Volume III under Contributions to Astronomy has the following articles in a sec-
tion on Hertzsprung-Russell Diagram: Olin J. Eggen—The Relationship between the
Color and the Luminosity of Stars near the Sun; Jesse L. Greenstein—The Spectra and
Other Properties of Stars Lying Below the Normal Main Sequence; Harold L. Johnson—
Photoelectric Studies of Stellar Magnitudes and Colors; Gerald E. Kron—Evidence for
Sequences in the Color-Luminosity for M-Dwarfs; Bengt Strömgren—The Hertz-
sprung-Russell Diagram; and in a section on Spatial Distribution of Galaxies has:
G. C. McVittie—Galaxies, Statistics and Relativity; Jerzy Neyman, Elizabeth L. Scott
and C. D. Shane—Statistics of Images of Galaxies with Particular Reference to Cluster-
ing; and F. Zwicky—Statistics of Clusters of Galaxies; under Contributions to Physics
has: André Blanc-Lapierre and Albert Tortrat—Statistical Mechanics and Probability
Theory; M. Kac—Foundations of Kinetic Theory; J. Kampé de Fériet—Random So-
lutions of Partial Differential Equations; Elliott Montroll—Theory of the Vibration of
Simple Cubic Lattices with Nearest Neighbor Interactions; and Norbert Wiener—Non-
linear Prediction and Dynamics.

Volume IV under Contributions to Biology lists: James Crow and Motoo Kimura— 
Some Genetic Problems in Natural Populations; Everett R. Dempster—Some Genetic 
Problems in Controlled Populations; Jerzy Neyman, Thomas Park and Elizabeth L. 
Scott—Struggle for Existence; and under Contributions to Problems of Health lists: 
M. S. Bartlett—Deterministic and Stochastic Models for Recurrent Epidemics; A. T.
Bharucha-Reid—On the Stochastic Theory of Epidemics; Chin L. Chiang, J. L. Hodges, 
Jr. and J. Yerushalmy—Statistical Studies in Medical Diagnoses; Jerome Cornfield— 
A Statistical Problem Arising from Retrospective Studies; David G. Kendall—Deter- 







ministic and Stochastic Epidemics in Closed Populations; and William F. Taylor— 
Problems in Contagion. 

Volume V under Contributions to Econometrics lists: Kenneth J. Arrow and Leonid 
Hurwicz—Reduction of Constrained Maxima to Saddle-Point Problems; Edward W. 
Barankin—Toward an Objectivistic Theory of Probability; C. West Churchman—Prob-
lems of Value Measurement for a Theory of Induction and Decisions; Patrick Suppes—
The Role of Subjective Probability and Utility in Decision-Making; under Contributions
to Industrial Research lists: Albert H. Bowker—Continuous Sampling Plans; Cuth- 
bert Daniel—Fractional Replication in Industrial Research; Milton Sobel—Sequential 
Procedures for Selecting the Best Exponential Population; and under Contributions to 
Psychometry lists: T. W. Anderson and Herman Rubin—Statistical Inference in Fac- 
tor Analysis; Frederick Mosteller—Stochastic Learning Models; and Herbert Solo- 
mon—Probability and Statistics in Psychometric Research: Item Analysis and Classifi- 
cation Techniques. D. D. F. 


Introduction to Statistics. Frederick C. Mills. New York: Henry Holt and Company, 
1956. Pp. xv, 637. $6.00. 


EDWIN S. MILLS, Massachusetts Institute of Technology


THIS book is an abridgement of the recently published third edition of Professor
Mills’ Statistical Methods, reviewed in this journal, Vol. 51 (1956), p. 376. The
main omissions are two more advanced chapters on nonlinear and multiple regression, 
two chapters on the chi-square distribution and the analysis of variance, and the 
appendixes relevant to these chapters. In addition there are minor deletions of more 
difficult discussions from several chapters, none of which impairs the continuity of the 
exposition. Altogether the larger volume is shortened by 175 pages of text and 25 
of appendixes, leaving 525 pages of text and 92 of appendixes in the new volume. 
The advantage of the abridgement is of course its cheapness, though the marginal 
cost to the purchaser of the extra 200 pages is only $1.50 (the publisher reports that 
the price of Statistical Methods is now $7.50) and this must be weighed against the 
marginal utility of the deleted material. Even the shorter volume is probably too 
long for the “short” course for which it is intended and some selection will still be 
necessary. The chapters which have been deleted are, however, ones which would 
certainly be unsuitable for a one-semester course based on a textbook of this type and 
the shorter volume may thus serve a useful purpose. 


Business and Economic Statistics. William A. Spurr, Lester S. Kellogg, and John H.
Smith. Homewood, Illinois: Richard D. Irwin, Inc., 1954. Pp. xi, 580. College price— 
$6.50. 


GOTTFRIED E. NOETHER, Boston University


THIS book is designed for a basic course in statistics offered in departments of
business, economics, and general social science with emphasis on the use of
statistical methods in business and economics, rather than on theory. The table of 
contents indicates the general scope of the book: 1. Statistics in Business and Eco- 
nomics. 2. How to Handle Numbers. 3. Use of Research Sources. 4. Collection of 
Original Data. 5. Methods of Selecting Samples. 6. Tables. 7. How to Construct 
a Chart. 8. Common Types of Charts. 9. Frequency Distributions. 10. Averages. 
11. Dispersion and Skewness. 12. Index Numbers. 13. Some Important Indexes. 14. 
Analysis of Business Fluctuations. 15. Secular Trend. 16. Seasonal and Cyclical Varia- 
tions. 17. Simple Correlation. 18. Multiple Correlation. 19. Reliability of Statistical 
Measures. 20. Statistical Quality Control. (The last chapter was written by Frank J. 
Williams and David S. Chambers.) Each chapter closes with an extensive summary,
list of problems (without answers), and references for additional reading. Besides 







the tables usually found in elementary statistics texts, there is a list of 127 selected 
sources of business statistics arranged according to issuing agencies. 

According to the authors the present volume was first undertaken as a revision of 
Brumbaugh and Kellogg’s Business Statistics (Chicago: Richard D. Irwin, Inc., 
1941); however, developments in statistical techniques and applications since World 
War II necessitated a completely new presentation. Under the circumstances, it is 
regrettable that the authors have not come up with a more modern treatment of 
the subject. 

The reviewer likes to think that it is now quite generally accepted that one of the 
important purposes of an elementary course in statistics is to teach the basic ideas of 
statistical inference. The authors of the present volume seem to agree with this 
requirement since they state in the preface that “principles of statistical inference 
are emphasized throughout.” While statistical inference is mentioned in Chap. 5 on 
Methods of Selecting Samples, for all practical purposes, it is then lost sight of until 
Chap. 19 on the Reliability of Statistical Measures. 

Essentially then, this is a book dealing almost exclusively with what is usually 
referred to as descriptive statistics. Even as such, it contains some annoying mis-
information. Thus, it is repeatedly claimed that the interval x̄ ± σ (σ is used through-
out the book to denote the sample standard deviation, while the population standard
deviation is denoted by σ′, though not consistently) contains 68 per cent of all items
(see, particularly, p. 233). Again, on p. 389, “. . . when all points are on a curve, the 
correlation is said to be perfect.” 
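
Both claims quoted above are easy to check numerically. The sketch below uses invented data: a strongly skewed set for which the interval of one standard deviation about the mean holds far more than 68 per cent of the items, and a set of points lying exactly on the parabola y = x² whose product-moment correlation is zero. The helper names are hypothetical, and the standard deviation is computed with divisor n.

```python
from statistics import mean, pstdev


def fraction_within_one_sd(data):
    """Share of items lying within one standard deviation of the mean."""
    m, s = mean(data), pstdev(data)
    return sum(1 for x in data if m - s <= x <= m + s) / len(data)


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of paired values."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


skewed = [1] * 9 + [100]      # nine small items and one very large one
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]      # every point lies exactly on a curve
```

The 68 per cent figure belongs to the normal distribution, not to data in general, and the correlation coefficient measures linear association only.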

Throughout the book, there is considerable emphasis on “short-cut graphic 
methods of analysis.” While such methods may seem attractive to some, they cer- 
tainly do not allow for statistical inference because of their subjective nature. To 
the reviewer, the problem of graphic methods is well illustrated by a quotation from 
p. 400: “In order to show the line which the graphic process is designed to estimate, 
the regression line AB (fitted by the method of least squares in a later section) is 
drawn on Chart 17-3.” 

As far as Chap. 19 is concerned, the main emphasis is on the use of standard errors 
as “indicators of reliability.” While the necessary steps for carrying out the more usual 
tests of hypotheses are indicated, no attempt is made to discuss the general nature 
of tests, nor even the assumptions underlying each particular test. The relative
unimportance placed by the authors on statistical inference is perhaps best illustrated 
by the fact that the list of references for Chap. 19 contains only three books, pub- 
lished in 1939, 1942, and 1950, respectively, the third actually being a new edition 
of a much older book. 

The chapter on statistical quality control offers enjoyable reading. 


Fundamental Statistics in Psychology and Education, New Third Edition. J. P. Guilford. 
New York: McGraw-Hill Book Co., Pp. xii, 565. $6.25. 


ROBERT R. BUSH, New York School of Social Work, Columbia University


THIS book is a revision of a standard conventional text on introductory applied
statistics. Like the earlier editions, this “new third edition” begins with descriptive
statistics and then treats the normal distribution, correlation methods, hypothesis 
testing, and measurement theory. The author has considerably condensed the ma- 
terial on measurement and scaling which appeared in the earlier editions in order to 
reduce duplication with his other text, Psychometric Methods. 

Additions to the revision are rather minor and therefore disappointing in view 
of the rapid developments in statistics during recent years. The discussions of non- 
parametric methods are few and brief. The problem of comparing individual means 
in analysis of variance designs is barely mentioned after the student is incorrectly 







advised to carry out multiple t-tests (p. 264). This is unfortunate in view of the 
modern developments of Tukey, Link and Wallace, and others. 
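
A rough sketch of why repeated t-tests on individual means are hazardous follows. It assumes, for simplicity, that the tests are independent, which pairwise comparisons sharing a group are not, so the figure is only a guide to the order of magnitude. The function names are hypothetical.

```python
from math import comb


def pairwise_comparisons(groups):
    """Number of distinct pairs of group means that could be compared."""
    return comb(groups, 2)


def familywise_error_rate(tests, alpha=0.05):
    """Chance of at least one false rejection among independent tests,
    each carried out at significance level alpha."""
    return 1 - (1 - alpha) ** tests
```

With five groups there are ten possible comparisons, and under the independence assumption the chance of at least one spurious "significant" difference at the 5 per cent level is about 0.40.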

Short-cut methods involving the range are not included. In fact, the author 
repeats the familiar but inaccurate remark that the range is unsatisfactory because 
“only two measurements are used to determine it” (p. 79). However, an abridged 
form of Snedecor’s table of the ratio of the range to the standard deviation is given,
and it is suggested that it can be used as a “rough check” on computations of the 
standard deviation. 

Another subject which is inadequately treated is the power of tests. This problem 
is but briefly mentioned (pp. 189, 217) and the reader is given no help in estimating 
the probabilities of Type II errors. Sampling problems and methods are described in 
a few scattered places but no unified treatment is offered. In this respect, Guilford’s 
book is very different from the new Wallis-Roberts text, Statistics: A New Approach, 
which introduces sampling as the central theme. 
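
To indicate the kind of help a reader might want with Type II errors, here is a minimal sketch for the simplest case: a one-sided test of a normal mean with known standard deviation. The function name and parameters are illustrative and are not drawn from either text.

```python
from statistics import NormalDist


def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided z-test of H0: mu = mu0 when the true mean
    is mu1 (taken to be at least mu0), with known sigma and sample size n."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)            # rejection cutoff in z units
    shift = (mu1 - mu0) * n ** 0.5 / sigma     # true mean in z units
    return 1 - z.cdf(critical - shift)
```

The probability of a Type II error is one minus the returned power; for instance `1 - power_one_sided_z(0, 0.5, 1, 25)` gives it for a true mean half a standard deviation above the null value with 25 observations.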

No material on estimation or on sequential problems is included. This is particularly 
unfortunate in view of the increasing use of mathematical models in psychology. In 
the opinion of this reviewer, estimation of parameters in psychology may soon become 
as important as testing of hypotheses. Furthermore, the very nature of psychological 
data has made many people aware of the importance of sequential analyses. 

In summary, Guilford’s revision presents nothing new for teaching or research. 
Courses in statistics for psychology and education need considerable revision but 
Guilford’s book will be of no help. 


Curso de Estadística. Volumen I: Estadística Descriptiva y Modelo Matemático. P.
Enrique Chacón, S. I. Bilbao, España: El Mensajero del Corazón de Jesús, 1955. Pp. xvi,
494. 200 ptas.


JORGE ARIAS B., Universidad de San Carlos de Guatemala


ACCORDING to the author, the book, which will consist of two volumes, is not a
treatise on mathematical statistics; but in fact it has more mathematics than 
many English books on mathematical statistics. This is true even though mathematical 
proofs are given only when necessary for understanding the statistical framework. 
The author’s intention is “to expound all those statistical methods related to oper- 
ational research,” except problems of econometrics, with emphasis on industrial 
applications. In 419 pages he includes an accumulation of theory, in certain respects 
encyclopedic, that sometimes makes it difficult to see what is common in many of the 
measures. But I think that for those who have already some grasp of statistical 
theory, this book is invaluable, and constitutes a great contribution to the scarce 
material written in Spanish. The language used is very clear, the typographical pres- 
entation is good, and taking into account the great number of mathematical formu- 
las, the errata found by the reviewer are very few. 

The first three chapters (18 pages) include the concept and definition of statistics, 
probability and its basic postulates, and a brief introduction to statistical data, with 
groupings and graphical presentation. 

The fourth chapter takes the student to the principal statistics of frequency dis- 
tributions, using moments. Chap. 5, entitled “Mathematical Models” treats moments 
in a more general form, as well as moment-generating functions and cumulants.
It undertakes the study of discrete and continuous variables. The next three chapters 
study specific distribution functions, such as those of discrete variables (binomial, 
Poisson, and hypergeometric), of continuous variables (normal) including the con- 
struction and use of probability paper, and continuous distribution functions (gamma 
and beta) and derived functions, such as χ², β₁, β₂, Student’s “t”, and Fisher’s “F”.










The following four chapters study bivariate distributions. Chap. 9 treats the 
regression line with two independent variables, with one or two variables with 
random components; Chap. 10 includes statistics of bivariate distributions (moments, 
correlation index, etc.) and some other measures, such as rank, intraclass, tetra- 
choric, and biserial correlation. The study of mathematical models and distribution 
functions of two variables (discrete and continuous) forms the content of Chaps. 11 
and 12. Finally Chaps. 13 and 14 study the case of n variables, including canonical 
correlation, discriminant functions, and mathematical models. These chapters are 
illustrated with problems in econometrics. 

The analysis and correlation of time series are the subjects of the three following 
chapters. The treatment is more or less the same as that in elementary statistics 
textbooks in English, including an introduction to harmonic analysis for measuring 
seasonal variation. It is regrettable that the level of these chapters is not in line with 
the level of the previous ones, and that they do not include, even as an introduction, 
the treatment of time series as stochastic variables. 

The last six chapters (134 pages) are dedicated to the study of moments and 
other measures (means, dispersion, etc.) of some statistics, giving first a discussion 
of the methods that can be used for obtaining such measures, and then applying 
the methods to different distributions: normal or asymptotically normal; distributions 
that do not follow the normal law; distributions of statistics when the variables 
follow any distribution; and finally the last chapter includes, in brief form, the 
generalizations of χ², t, F, etc., for multivariable distributions.

As can be seen from this summary of the contents, the scope of this first volume is 
quite ambitious. The second volume, whose appearance we await with interest, will 
complete one of the best contributions to statistics that has been made in Spanish. 
The second volume will include sampling theory, decision functions, hypothesis 
testing, distribution-free methods, analysis of variance, and will include applications 
to sampling of human populations, marketing investigations, quality control, accept- 
ance sampling, and design of experiments. 

The volume reviewed has many bibliographical notes, especially calling the atten- 
tion of the student to the original sources, but almost all of them are to publications 
in the United States or England. The reviewer would like to see also references to 
other European writings to which the author, without doubt, has ready access. 

A feature that also deserves attention is a valuable summary, in 16 pages, of the 
distribution of different statistics. The content of the table is as follows: the original 
variable, the variable subject to study, the corresponding parameter, the statistics 
to be used and the distribution, the mean, the variance, and the table to be used. 

Finally, this is not a beginner’s book. As background, the student needs formal 
training in elementary mathematics, calculus, determinants, and matrices. The 
author also assumes that the student has previously studied interpolation and least 
squares. The examples developed in the body of the text are very few; but at the
end of each chapter there is a collection of examples related to the material treated, 
although no answers are given. At the end of the book there is a collection of 25 
tables and three indexes: one of subject, one of symbols, and one of authors. 


Guide to Elementary Statistical Formulas. Robert E. Johnson and Doris N. Morris. 
New York: McGraw-Hill Book Company, Inc., 1956. Pp. vii, 101. $3.00. Loose-leaf. Paper. 


HERBERT ARKIN, City College of New York 


THIS unusual booklet has as its stated objective the bringing together of “the
elementary mathematical and statistical formulas which are frequently encoun- 
tered in business and which would be extremely useful if more readily available.” 







The contents are arranged in tabular form with little text. For each term there is
given a statement of the formula, definitions of the symbols involved, and an example 
of the computation. Page references to various texts are given at the end of the 
booklet. 

While the form of presentation is most useful for the purpose, the selection of the 
formulas included places heavy emphasis upon those of interest in general economic 
analysis. Almost one-half of the space is devoted to formulas relating to index num- 
bers and time series. An additional fifth of the contents is devoted to arithmetic and 
algebraic formulas. 

Many important areas in the purview of elementary statistical methods are treated 
in a cursory fashion or are omitted completely. Sampling formulas are not included 
at all. Correlation formulas are confined to simple linear correlation. 

While a book in this form with extensive coverage of the multiplicity of formulas 
encountered by the working statistician might well be useful as a reference volume, 
it is doubtful that this effort will serve its purpose well, due to its limited coverage 
and strange emphasis. 


Statistics: A New Approach. W. Allen Wallis and Harry V. Roberts. Glencoe, Ill.: The 
Free Press, 1956. Pp. xxxviii +646. $6.00. 


WILLIAM G. COCHRAN, Johns Hopkins University


THIS book is a general introduction to statistics. It explains the basic ideas with
little more than American high-school mathematics, and illustrates their uses
in an astonishing variety of interesting and important applications. The book evolved 
from nine years’ teaching to miscellaneous groups of students, mostly sophomores 
and juniors and mostly in the social sciences, business, and economics. The book is 
large, deliberately containing more material than will be covered in a single beginning 
course, so as to allow flexibility in the choice of content for individual groups, while 
maintaining the same basic approach throughout. 

Chaps. 1 to 3 form a prelude to the more technical part of the book. In Chap. 1, 
statistics is defined as “a body of methods for making wise decisions in the face of 
uncertainty” (p. 3), or more generally as “a body of methods for obtaining knowl-
edge” (p. 5), and the role of statistics in scientific method is discussed. In the second
chapter, 26 effective uses of statistics are recounted briefly in order to give the 
student an appreciation of the wide scope of the subject in human affairs. The 
second half of this chapter, which I particularly liked, presents in detail three types 
of statistical inquiry: Goldhamer and Marshall’s attempt to discover whether the 
frequency of psychogenic psychoses is higher to-day than in the middle of the last 
century, a controlled experiment to test whether large supplements of the B and C 
vitamins would affect the physical performance of soldiers engaged in high activity 
under intense cold, and an experiment designed to study one aspect of the efficacy 
of rain-making. In contrast to Chap. 2, the next chapter exhibits a large selection 
of common misuses of statistics, grouped according to the type of error committed.

The basic technical content (Chaps. 4-14) pursues a fairly familiar line of de- 
velopment. The notion of sampling from a population and the inevitability of 
sampling variation are illustrated by sampling experiments with beads in Chap. 4, 
which also covers the nature and purpose of randomization. Then follows an intro- 
duction to problems of measurement, often neglected in courses on statistics. The 
user of data is urged always to keep in mind the question: “How did they get these 
data?” Next comes the standard material on descriptive statistics: frequency dis- 
tributions and charts (Chap. 6), averages (Chap. 7), measures of variability (Chap. 8) 
and the study of association by means of two- and three-way tables (Chap. 9). 







The succeeding five chapters deal with statistical inference. The additive and 
multiplicative laws of probability (Chap. 10) lead to the binomial distribution, 
presented arithmetically rather than algebraically. The normal distribution is intro-
duced as an approximation to the binomial and then as an approximation to the 
distributions of sample means generally. In presenting the concept of a test of 
significance, perennially confusing to many students, the authors use a problem 
requiring a yes-no decision to make the argument more concrete. The usual tests 
of significance for the mean of a single population and for comparing two or more 
populations are given for both continuous variables and proportions, although 
reference to tables of t, F or χ² is avoided by showing how to transform each of these
variates approximately to a normal variate. To conclude this section, Chap. 14, on 
estimation, gives a glimpse of the method of maximum likelihood and presents con- 
fidence intervals for population means. 
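
The use of the normal distribution as an approximation to the binomial, mentioned above, can be checked arithmetically. The sketch below is not the authors' presentation; it uses the familiar continuity correction, and the function names are hypothetical.

```python
from math import comb
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact probability of at most k successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to the same probability, evaluated at k + 0.5
    (continuity correction), using the binomial mean and standard deviation."""
    mu = n * p
    sigma = (n * p * (1 - p)) ** 0.5
    return NormalDist(mu, sigma).cdf(k + 0.5)
```

For 100 trials with p = 0.5, comparing `binomial_cdf(55, 100, 0.5)` with `normal_approx_cdf(55, 100, 0.5)` shows agreement to about two decimal places; the approximation is poorer when n is small or p is near 0 or 1.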

The remaining section of the book, called “Special Topics,” contains introductory
accounts of the design of experiments, sampling methods, statistical quality control 
and acceptance sampling, regression and correlation, and time series. In teaching, 
topics from this section can presumably be chosen to suit the subject-matter interests 
of the class. The final chapter, on shortcut methods, is mainly taken up with the 
analogues of the t-, F- and r-tests based on ranks, and with the uses of binomial
probability paper. 

A series of examples for the student is included at the end of each chapter. Like 
the examples presented as illustrative material within the text, these exercises are 
outstanding in their variety and capacity for mental stimulation. 

On picking up any book entitled: Statistics: A New Approach, one’s first reaction 
is to ask “What’s new about it?” In the preface, the authors are disarming about 
this claim to novelty, suggesting that it is “nebulous and ambiguous,” and may 
imply merely that statistics itself represents a new approach to knowledge and 
practical action, although they do list what they consider the distinctive features of 
the book. In this reviewer’s opinion, the characteristics that render the book worthy 
of attention are principally the following: 

(1) The authors make good on their claim to present the basic statistical ideas 
with a minimum of mathematics. The exposition is consistently lucid, yet the prin- 
ciples are in no way watered down or distorted to ease the task of the writers. The 
authors have wisely chosen to take the space necessary to elaborate adequately 
on the background and the implications of the principles—this is partly why the 
book is large. 

(2) The number and variety of examples, already alluded to, is unequalled in any 
statistics book that I know. The examples are used with great skill in the presentation 
of principles and methods, and their range and interest should do much to enlist 
the student’s enthusiasm and give him a vision of the importance of the subject.

(3) Although written for the nonspecialist in statistics, the book supplies an 
unusually faithful insight into the nature of statistical work, both the exciting and the 
tedious parts, and drives home the lesson that statistics is everybody’s business. 

I read this book with a vigilant eye for defects, believing that when the editor of 
the Journal is a co-author, the reviewer should err, if err he must, on the side of 
severity rather than leniency. I had a poor return for my pains. I am, however, one 
of those who will object to the opening sentence of the book, previously quoted, which 
views statistics as a body of methods for “making wise decisions in the face of un- 
certainty.” Despite the contributions which statistics can make to problems involving 
decisions and the added insight that has come from looking at many of our standard 
techniques in terms of decision-making, this view of the function of statistics is
unduly restrictive, and if it becomes widely adopted may have unfortunate conse- 
quences for the full development of the subject. Moreover, on reading the third 
sentence of the book: “Indeed, even the pioneers in statistical research have adopted 
it [this view] only within the past decade or so,” I am tempted to follow the authors’ 
advice and ask one of their favorite questions: “How can they really know?”, par- 
ticularly since Fisher, who receives a handsome accolade in the book as the pioneer 
par excellence, has recently expressed his disagreement with this point of view, though 
perhaps after this book went to print. On the other hand, the book itself gives no un- 
due prominence to decision-making as the sole, or even the principal, function of 
the subject. 

As regards content, I should have liked to see some expansion of Chap. 15 (“Design 
of Investigations”), which deals briefly with the design of experiments and of sample 
surveys, on the grounds that these are the two general techniques most likely to 
lead to more reliable and informative data in the future. 

To sum up, this is the best book that I have seen for a beginning course to a group 
of students from varied disciplines. It would be excellent as the basis of a “cultural” 
course in statistics. The student who has to learn by himself will find it a pleasure to 
read. How far it will supplant books written for more specialized audiences I do not 
know, but anyone who has to teach an introductory course is advised to take a look 
at it. 


Experimental Designs in Sociological Research, Revised Edition. F. Stuart Chapin.
New York: Harper and Brothers, 1955. Pp. xii, 297. $4.50. 


Harold F. Dorn, National Institutes of Health


Reviews of the first edition of this book appeared in the September 1948 and June
1949 issues of this Journal. This edition contains the original chapters essentially
unchanged plus four new chapters entitled “Analysis of Variance and the t-Statistic:
Underlying Assumptions”; “Non-Parametric or Distribution-Free Statistical Methods”;
“The Ex Post Facto Design: Replication and Extension”; and “Some Problems
in Psycho-Social Measurement.” An appendix commenting upon reviews of the 
first edition also has been added. 

In spite of the severe criticisms of the first edition of this book, Chapin’s conception 
of experimentation and its role in sociological research has remained practically un- 
changed. Throughout his entire professional career Chapin has been a consistent 
advocate of the use of quantitative method in the study of social phenomena in actual 
community situations. The difficulties of doing this and the limitations of the re- 
sulting data are emphasized throughout this book. Nine studies chosen to illustrate 
the principles of experimental design in sociological research with respect to cross- 
section, prospective, and retrospective or ex post facto studies are described in detail. 

Admittedly there is no simple solution to the difficulties of carrying out social
studies in actual community situations. Chapin’s ideal solution for these difficulties 
is the single factor experimental design wherein only one condition is permitted to 
vary and all other conditions are kept constant. Undoubtedly this method has been 
and still is a sound method for studying physical phenomena under laboratory 
conditions. But there is a great deal of experience indicating that this method not 
only represents an unattainable goal in the study of a wide range of social and 
biological phenomena but also that it is not necessarily the most useful method. In 
one of Chapin’s examples, the effort to control only six factors by individual matching 
reduced the study and control groups from 597 to 23 individuals each! In defense of 
this, Chapin argues that homogeneity, not representativeness, is the essential condi- 
tion to the discovery by a single experiment of a real relationship between two
factors. If this be true, why not conduct all studies by comparing only two individuals 
chosen to have the maximum possible number of traits in common apart from the 
trait whose effect is to be measured? No answer to this question is found in this book 
nor are any guide lines given for deciding when individual matching has been carried 
too far. 

Chapin devotes a considerable amount of space to a discussion of the application 
of the usual tests of significance to data obtained from studies in which randomization 
is not possible and indicates that doubts concerning the applicability of these tests 
to data collected in sociological studies are a handicap to research. This discussion
overemphasizes the role of statistical tests of significance in improving quantitative 
studies of social phenomena. The primary value of randomization in the modern 
theory of experimental design is that it provides a firm basis for the generalization 
of the results of studies. A statistical test of significance provides only one of the sev- 
eral bits of evidence required for the interpretation of data. It cannot compensate for 
failure to design a study so as to provide the information desired to answer the ques- 
tion posed nor for inability to measure adequately the phenomenon being investi- 
gated. 

The accumulation of scientific knowledge began long before the development of the 
modern theory of experimentation based upon randomization, replication, and 
factorial design. This theory, whenever it can be applied, provides the best protection 
against the misinterpretation of experimental results. However, many sciences must 
depend entirely upon observation of naturally occurring phenomena. Many aspects 
of social and biological behavior still are beyond the reach of experimentation. But 
the scientific study of such behavior need not therefore be abandoned. One might 
expect that the principles of investigational methods developed by generations of 
scientists engaged in the study of observational data at least would be mentioned 
in a book discussing methods of studying social phenomena. But such is not the case. 
A discussion of guiding principles for the collection and analysis of observational 
data and of safeguards against erroneous generalization is lacking in Chapin’s book. 

A footnote to the discussion of the theory and practice of the experimental method 
states that much of this chapter originally appeared in an article published in 1917. 
At that time, field investigations and studies in the social and biological sciences 
were in their infancy. Subsequent experience has modified concepts of experimental 
design and of methods of conducting field investigations of existing community 
situations. One looks in vain for full recognition of these developments in the revised 
edition of Experimental Designs in Sociological Research. 


Experimental Design: Theory and Application. Walter T. Federer. New York: The Mac- 
millan Company, 1955. Pp. xix, 591. $11.00. 


R. L. Anderson, North Carolina State College


Although an unduly large number of books on statistical methods have been pub-
lished recently, the number devoted strictly to the subject of experimental 
designs has remained relatively small. Hence there was need for a book of this type 
written by someone with the wide experience of Federer. This book is well written, 
with few of the printing errors which usually attach themselves to first editions. 
A tremendous variety of topics is considered, many illustrated with “live” data. 
The author presents the theory and use of the standard field designs: completely 
randomized designs and complete and incomplete randomized blocks and latin
squares. He properly differentiates between blocking methods and treatment arrange- 
ments, and devotes several chapters to factorial arrangements. The usual confounding 
methods, including split-plots, are discussed in great detail. One complete chapter is
devoted to covariance. Many problem-exercises are presented for each type of design, 
plus extensive references to experiments and analytical procedures. Unfortunately 
few, if any, examples are taken from the engineering and physical sciences, which 
are becoming a most fertile field for the use of experimental designs. Presumably 
for this reason, two important aspects of design construction are mentioned in a very 
cursory manner: fractional replications and response surfaces. Procedures of testing 
for main effects and interactions are presented, but the author fails to discuss methods 
of evaluating various factors when interactions are present. 

There is extensive introductory material on: (i) tests of significance for a group 
of ranked means; (ii) transformation of data; (iii) tests to determine when the usual 
assumptions behind the analysis of variance are not satisfied; (iv) principles of deter- 
mining the size and shape of experimental units; (v) the number of replications re- 
quired to attain desired results. Unfortunately, the author presents a large number of 
methods under (i) without critically evaluating them; this leaves the reader in a 
state of confusion regarding their relative merits. For example, it is not brought out 
that the maximum error rate for the Duncan multiple range test in asserting the
inequality of two of p truly equal means is $\alpha_p = 1-(1-\alpha)^{p-1}$, whereas for some tests
$\alpha_p$ is constant for all p. Transformations are often useful in stabilizing variances; however,
the author fails to tell the reader what to do when a transformation will not correct
the difficulty. Other texts mention possible procedures, as does Cochran in his
Biometrics (1947) article referred to in the book. 
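
The growth of this error rate with p is easily tabulated. The following is a minimal sketch, assuming the rate takes the form 1 - (1 - alpha)^(p - 1) for a range spanning p ranked means; the function name and the choice alpha = 0.05 are illustrative and are not taken from Federer's text.

```python
def duncan_error_rate(alpha, p):
    """Maximum error rate for a range of p means, assuming the
    protection level (1 - alpha) ** (p - 1) for that range."""
    if p < 2:
        raise ValueError("a range must span at least two means")
    return 1.0 - (1.0 - alpha) ** (p - 1)


# Illustrative nominal level; any alpha in (0, 1) may be substituted.
alpha = 0.05
rates = {p: duncan_error_rate(alpha, p) for p in range(2, 7)}

for p, rate in rates.items():
    print(f"p = {p}: alpha_p = {rate:.4f}")
```

Under this assumption the rate equals alpha only for p = 2 and rises steadily thereafter, which is the contrast with the constant-rate tests that the reader would need in order to compare the methods.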

A substantial amount of space is devoted to the least squares and variance com- 
ponent theory applicable to each type of design. This material is well written and 
should be invaluable to those readers interested in the theoretical foundations of es- 
timating and testing procedures. Also the author wisely presents these sections in 
such a manner that the reader not interested in theory can omit them and still un- 
derstand the analytical procedures. Scant attention is paid to general computing 
procedures for multiple regressions and for multiple classification data with unequal 
subclass numbers; the reader is sometimes referred to other sources for the details, 
but in many cases he is left to search for himself. The author does not include in his 
regression models terms involving interactions between variates from both finite and 
infinite populations; hence, he avoids the current controversy regarding the expec- 
tations of mean squares in these cases. Similarly he avoids all the difficulties of vari- 
ance component analysis with multiple classification data with unequal subclass num- 
bers (both for mixed and random models). 

The reviewer disagrees with the author’s insistence that, “Analysis of variance 
techniques is suitable only for experiments in which use has been made of the elements 
of chance in allotting the treatments to the experimental units” (italics by the re- 
viewer). Admittedly Federer’s point of view seems to be supported by many statis- 
ticians. It would be understandable if the results were based on randomization 
theory; however, there is nothing in the theory of least squares which requires random 
allotment. Randomization is only one device used to minimize aberrations from the 
assumption of uncorrelated errors. In many instances nature or the production 
process is an excellent randomizer. Randomization is insurance against conscious 
and possible subconscious biases in allocating treatments; however, one should 
always compare the cost of insurance with the magnitude of the possible calamity 
being protected against. In particular, it seems unreasonable to reject balancing
procedures categorically as the author does, especially since covariance usually can 
be used to obtain an unbiased estimate of the error variance. 

The author extols the virtues of the latin square design without warning the reader 
of the unfortunate results one can obtain if real interactions exist among row, column,
and treatment effects. This latter difficulty is especially important when the rows or 
columns are also treatments; in this case, one is using a fractional factorial design
with a highly undesirable alias pattern. 
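
The point can be made concrete with a small enumeration. The sketch below assumes a cyclic 3 x 3 latin square (an illustrative choice, not a plan from the book) and counts how many of the row x column x treatment combinations are actually run; all names are hypothetical.

```python
from itertools import product


def cyclic_latin_square(n):
    """Cyclic latin square: the treatment in cell (r, c) is (r + c) mod n."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


def runs_used(square):
    """Set of (row, column, treatment) combinations that the square runs."""
    n = len(square)
    return {(r, c, square[r][c]) for r in range(n) for c in range(n)}


n = 3
square = cyclic_latin_square(n)
used = runs_used(square)
full = set(product(range(n), repeat=3))  # complete n x n x n factorial

fraction = len(used) / len(full)
print(f"{len(used)} of {len(full)} combinations run (fraction {fraction:.3f})")
```

In this particular square the treatment label is fixed by (row + column) mod 3, so treatment differences cannot be separated from that component of the row x column interaction. This is one instance of the alias pattern the reviewer warns about.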

The reviewer is skeptical of the recommendation that users of incomplete block 
designs may ignore the block restrictions in the analysis if the interblock mean 
square ($E_b$) is less than the intra-block mean square ($E_e$). Federer cautions against
recovering inter-block information with less than 12 degrees of freedom ($f_b$) for $E_b$
and then proceeds to present computing details on p. 268 and in Example XI-1 when
$f_b$ is 4, without warning the reader.

One continues to wonder why writers of textbooks on experimental designs discuss 
the nonbalanced designs first and then, seemingly as an afterthought, bring up 
balanced designs. Why not start with balanced designs and then present the general 
principles of partially balanced designs with illustrations of the simpler ones, such 
as the square lattices? 

This reviewer’s over-all evaluation of the book is that Federer has tried to cover 
too many topics. The table of contents would lead one to believe that this is a much 
more comprehensive book than any of its competitors, yet it lacks many of the details 
needed by the user of statistical methods. For example, the author tries to cover the 
complicated subject of long-term experiments in eight pages. In addition, how can 
an experimenter set up a confounded factorial experiment or use an incomplete blocks 
design without a set of plans? Yet the only design plans included in this book are those 
illustrated in examples. One finds it difficult to recommend to experimenters a book 
which retails for $11, and which still must be supplemented by another book contain- 
ing such plans. However, if an experimenter desires another reference on experimental 
designs in addition to that containing his design plans, Federer’s book is strongly
recommended. It certainly deserves to be on every statistician’s desk; even in those 
cases where the discussion is too brief, the bibliography is excellent. 


Handbook of Industrial Statistics. A. H. Bowker and G. J. Lieberman. New York: Pren- 
tice-Hall, Inc., 1955. Pp. xii, 774-958; viii. $5.00. 


Milton E. Terry, Bell Telephone Laboratories, Inc.


This book is a reprint of the section on industrial statistics in the Handbook of
Industrial Engineering and Management. It comprises seven articles entitled as
follows: 1, Basic Statistical Concepts; 2, Statistical Quality Control; 3, Sampling 
Inspection; 4, Common Significance Tests; 5, Curve Fitting; 6, Analysis of Variance; 
and 7, Analysis of Enumeration Data. Articles 3, 4, and 6 comprise 132 of the 184 
pages of text material and will be reviewed first. 

Article 3, Sampling Inspection, is a comprehensive handbook discussion of all 
the important sampling plans presently available to the statistician. While terse, it 
is clear, unambiguous and altogether an excellent summary reference source. 

Article 4, Common Significance Tests, develops first the concept of the power of 
the test as an OC curve. Then, the one-sided and two-sided U and t tests for single
and double samples, the chi-square test of the hypothesis that the variance of the 
population from which the sample has been drawn is equal to some specified value, 
and the F test for the hypothesis of the equality of variances, are presented together 
with their OC curves. The last half-page of this article is devoted to problems of 
estimation with emphasis on confidence limits. This article depends only on the 
assumption of normality and no mention is made of the assumptions of randomness, 
independence, or homoscedasticity. The concept of transforming the variates either to 
stabilize the variance or to approximate a normal distribution is not introduced.

Article 6, Analysis of Variance, includes the analysis of variance for balanced one, 
two, and three-way classifications; an interaction model for two and three-way 
classification experiments; the analysis of the latin square; a brief presentation of 
a components of variance model; a terse statement of Hartley’s test for homogeneity 
of variances; the F test for homogeneity of treatment means; and both Scheffé’s and 
Tukey’s tests for comparison of treatment means. The analysis of each design is 
presented together with a completely worked example. In this article the authors do 
refer to the assumptions of independence and randomness of the underlying homo- 
scedastic error distributions. There is, however, no discussion of the consequences 
of failing to meet any of these assumptions or of methods for analyzing data in which 
one or more of these assumptions fail. The discussion of the components of variance 
model is unsatisfactory even at an elementary level since there is the plain implication 
that the only possible model is the completely random one. The question of appro- 
priate error terms when interactions are not negligible is left almost untouched. 
Nothing is said about subsequent steps when the null hypothesis of homogeneity of 
variances is rejected. In summary, this article gives techniques concerning, but little 
insight into, the analysis of experiments.

Article 5, Curve Fitting, deals mainly with the least squares method of fitting 
a straight line. This part of the article is quite complete and comprehensive, discussing 
extrapolation, interpolation, confidence limits and inverse regression where the 
independent variable is represented by a set of fixed numbers. There is a misstatement 
on p. 890: “a sufficient condition for the existence of a linear relationship is that both 
x and y have a bivariate normal distribution”; but a reference at the end of this 
sentence leads to a footnote which is correct. In the formulation of inverse inter- 
polation, p. 894, the authors fail to make the qualifying condition that the quantity 
c shall be non-negative. For if this quantity be negative it is possible to have complex 
limits. The latter part of this article looks briefly at correlation and multiple regres- 
sion. The treatment of correlation consists of stating the definition of the Pearsonian 
sample correlation coefficient and the statement that in engineering applications the 
correlation coefficient does not play a very important role. But the implication that 
correlation is of no engineering interest is unfortunate where the authors have 
included in Article 7 (analysis of enumeration data) chi-square tests which are 
appropriate to engineering studies of correlation. 
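
The difficulty with inverse interpolation can be illustrated numerically. The sketch below assumes the usual Fieller-type limits for straight-line regression, in which the critical quantity is c = b^2 - t^2 s^2 / Sxx; whether this is exactly the c of p. 894 is an assumption, and the data, function name, and tabled t value are illustrative only.

```python
import math


def inverse_interpolation_limits(x, y, y0, t_crit):
    """Fieller-type limits for the x that gave rise to a new observation y0.

    Assumes a straight-line fit with normal errors; t_crit is the tabled
    two-sided t value on n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    sse = sum((yi - y_bar - b * (xi - x_bar)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    # Assumed form of the qualifying quantity c discussed in the review.
    c = b * b - t_crit ** 2 * s2 / sxx
    if c <= 0:
        raise ValueError("c is not positive: no finite real limits exist")
    d = y0 - y_bar
    half = t_crit * math.sqrt(s2 * (d * d / sxx + c * (1 + 1 / n)))
    return x_bar + (b * d - half) / c, x_bar + (b * d + half) / c


# Hypothetical calibration data, not taken from the handbook.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
lo, hi = inverse_interpolation_limits(xs, ys, 6.0, 3.182)
print(f"limits for x0: ({lo:.3f}, {hi:.3f})")
```

When the slope is poorly determined relative to its standard error, c falls to zero or below and the formula no longer yields a finite interval; the sketch raises an error in that case rather than return a misleading pair of limits.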

The first article (Basic Statistical Concepts) considers briefly empirical distribu- 
tions and histograms and two theoretical distributions; namely, the normal distribu- 
tion and the binomial distribution together with their first and second moments. In 
this short article there are certain statements which are misleading, ambiguous, or 
false. For example, in the introductory chapter the authors state, “the science of 
statistics deals with drawing conclusions from observed data,” and later on in the 
same section make the following statement: “Ultimately, these observations are to 
be used for making decisions. The remainder of this section (i.e., this text) deals with 
providing procedures for making decisions with preassigned risks on the basis of the 
limited information in samples.” Clearly the authors believe that decision making 
is equivalent to conclusion drawing. Further, on p. 779 the statement, “To make 
valid inferences on the basis of small samples it is necessary to make some assump- 
tions about the form of this [underlying] population” apparently excludes non- 
parametric inference. In the sample chosen to illustrate the use of histograms the au- 
thors have unfortunately chosen an example from life testing wherein they fit a 
normal distribution. Further, the authors state “One of the classical theorems of 
probability states in essence that if observed quantities can be considered to be the 
result of large numbers of additive chance effects the distribution of these quantities
should be approximately normal. This theory plus a mass of empirical evidence 
indicates that the normal distribution may be assumed as the underlying population 
for a large number of industrial problems.” This development may very well tend 
to lead the student reading this text to believe that most life test data can be assumed 
to follow the normal distribution, which is most certainly not the case. In addition, 
many other engineering problems are of such a character that the normal distribution 
is quite unsatisfactory as a model on which to base statistical inferences. 

Article 2, on Control Charts, can only be described as inadequate. The authors 
confuse design specifications and the relationship between such limits and process 
capability limits. They contend that, if the process cannot produce units 99.9 per 
cent of which conform to the specifications, then either the process must be changed 
or the specifications rewritten. This is not true. Many processes today are economi- 
cally feasible with a yield of only 50 per cent conforming to specification. Further 
the only statistics discussed in this section for quality control charts are $\bar{x}$, $R$, $s$,
$\bar{R}$, $\bar{s}$, $p$, $c$. No mention is made of the use of the median, mid-range or the median
range. The setting up of control charts is predicated on the process being in control, 
and the authors seem to feel that the main feature of a control chart is the detection
of outages. Consequently the importance of runs is overlooked completely. 

The tables and figures are numbered serially 13.1 through 13.54 and 13.46 respec- 
tively and are difficult to find. The nomogram, p. 958, is given without detailed in- 
structions although the text states that such instructions are on the nomogram. The 
references included in the text range, article by article, from excellent to unsatis- 
factory, Article 3 being excellent, Article 6 satisfactory, Article 7 fair, and the 
remainder unsatisfactory. All of the computing formulas and techniques discussed 
in this text are based on the availability of a hand calculator with no attention to 
the use of larger machines. 

With the exception of Article 3 on Sampling Inspection it is hard to imagine this 
text being considered as an up-to-date handbook of industrial statistics. 


Nellie Landblom’s Copybook for Beginners in Research Work. Nellie Thompson Land- 
blom. Fort Collins, Colorado: Multigraph Service Bureau, Colorado A. & M. College, 
1955. Pp. ii, 118. $2.95. Paper. 


A. E. Sarhan, University of North Carolina


The author stated in the preface that the copybook was prepared in response to
the requests made by graduate students upon leaving college to have available a 
set of model calculations. 

“A preliminary knowledge of the fundamental principles of statistics is assumed 
and is necessary if the student is to know which model to use and how to interpret 
his results.” 

The copybook consists of two parts. Part one contains the analysis of some ex- 
perimental designs as well as intraclass correlation, paired data and group compari- 
sons. 

The data in this first part are purely hypothetical and, as the author stated, 
“cannot be used as such for any experiment.” This part is written to serve only one 
purpose, viz., to show method of procedure. The author explains, “In a hypothetical 
study such as this one, the results may seem absurd. However, the whole objective 
is computational procedure, which purpose is accomplished no matter how farfetched 
the results may appear to be here.” 

In general then, this first part is a collection of some numbers supposed to form 
different designs and the arithmetic calculations are given without any explanation.
No formulas or models are given in any stage of any problem considered. This leaves
the reader to guess the general form of the calculation and the formulas to be applied 
when similar numbers (which are present) in the data cause confusion. 

It seems to this reviewer that if this part is written for those who know the formu- 
las, then they will gain nothing. If the procedures are intended for those who are not 
familiar with the design and assumptions, not only will it be difficult to follow but 
also dangerous because of probability of misuse. The use seems to be limited to the 
author’s students and its value in this connection is doubtful. 

Statements of procedure are brief and not at all self-explanatory. 

Components of variance are given only for the fifth problem and not included any- 
where else. The author states “they are of very little use in usual practical experi- 
ments.” 

The efficiencies of different designs are not discussed, and several important designs 
and principles are neither included nor mentioned (e.g., incomplete blocks, con- 
founding). 

Part II is a reproduction of a paper which resulted from an actual experiment con- 
ducted by graduate students. In this part, “the details of operation are not clearly 
outlined as in the material of Part I, the assumption being that those wishing to use 
this type of material are more familiar with statistical methods than those who may 
use the first part.” In this part, simultaneous equations, variance analysis, deviation 
from linearity, actual and expected means, interaction, tentative values of constants
are included.

The main purpose of this paper (Part II) “is to clarify the procedures used in the 
fitting of constants where the equations are not all independent.” 

This part of the copybook is not a textbook in statistics or statistical design. It is 
simply the reproduction of a study with some statistical procedures involved, not 
written in the format of a book. Although it was prepared to teach calculation
procedures and the details thereof, the author admits there are discrepancies in
computation which were not worth the effort of recomputation (pp. 109-110)! 

The methods used in the simultaneous equations (when given) are neither the 
most common nor most recent. 

The copybook on the whole is not successful in introducing statistical procedures 
involved for beginners in research work and omits many topics which are greatly
needed by research workers.

The mechanical reproduction of the book is unclear. In many places the print is 
faint and could be read only with difficulty and checking. 


An Introduction to British Economic Statistics. Ely Devons. New York: Cambridge 
University Press, 1956. Pp. viii, 255. $4.00. 


Thor Hultgren, National Bureau of Economic Research, Inc.


This volume is designed for first or second year students of economics and commerce
and for businessmen, politicians, journalists, and others who may have no
previous acquaintance with statistical source materials and the complexities that 
surround them but nevertheless have occasion to use figures from them. The author 
sets himself three major objectives: to tell where “the main British economic statis- 
tics .. . can be found”; to explain “what they mean, what problems arise in using 
and interpreting them”; and to give guidance toward further reading and study. 
Subjects dealt with are population, manpower, industrial production (including min- 
ing, construction and public utilities), agriculture, distribution, transport, foreign 
trade, prices, wage rates, incomes, and national product. Public finance is left out
because of “peculiar difficulty and intricacy”; in fact, banking and private finance 
are likewise omitted. 

The listing of sources is reasonably thorough and the description of each usually 
conveys a pretty good idea of the kind of detail to be found in it. The author does not, 
however, give readers any hint of the commodity and operating detail in the railway 
statistics. Titles are spelled out, and where the dates of survey and publication are 
irregular, as in the case of the Census of Production, they are specified. Less obvious 
sources, such as the Import Duties Act Inquiries, for production, are mentioned as 
well as the regular ones. Statistics of private origin such as those of the British Iron 
and Steel Federation, are noted, and so are convenient nongovernmental assemblages 
of figures such as those for roads and road vehicles. The emphasis is on recent statis- 
tics but leads to the past are also provided. 

Among the many problems of meaning and interpretation the author considers 
are the following. Scotland or Northern Ireland is sometimes included in a particular
set of figures, sometimes not. Population data may or may not include foreign visitors, 
British armed forces abroad, allied armed forces in Britain. In a particular locality 
the daytime may be quite different from the nighttime, or the off-season from the 
on-season, population. Death rates should for some purposes be standardized by age 
and occupation. Birth rates should be stated in terms of female population in the vari- 
ous child-bearing age-groups to gauge the long-run outlook. There are more people 
“not at work” than “unemployed.” Sales of an industry, sales of its characteristic 
products by that industry, and sales of its characteristic products by all industries 
can be three different things. The “net” product of a manufacturing establishment 
does not include all products or services received from other enterprises. Britain’s 
one Census of Distribution had considerable failure of response. The month in
which imports and exports are recorded is the month in which the Statistical Office 
receives shippers’ declarations from the customs officers, not the month in which 
the goods enter or leave, still less the month in which prices were agreed upon and 
commitments made. The balance of commodity trade is a poor indicator of the bal- 
ance of payments on current account. One section of the balance on capital account 
is “forced” to produce an over-all balance in spite of statistical discrepancies. The 
lower income brackets include beginners, part-time workers, and retired people. 
Internal Revenue figures for entrepreneurs pertain to a period a year earlier than 
those for other kinds of income recipients. 

In the limited space available for the topic of national income and expenditure, 
the author gives a surprisingly lucid account of such matters as the payments versus 
the product approach, gross versus net product, the inclusion of transfer payments 
for some purposes but not others, tax accruals versus tax payments, measurement 
of government product by wages and salaries, treatment of housing but not other 
consumer durables or inventories as savings, and the equality of savings and invest- 
ment. He shows a proper distrust of capital consumption estimates. 

Devons fulfills his promise of guidance toward further study. He gives numerous 
and specific references to official and unofficial explanations and criticisms of the 
figures and of the concepts such as adjusted birth and death rates. 

For readers versed in American statistics, perusal of this book will emphasize the 
broad similarity between industrially advanced countries with respect to the kinds 
of data considered desirable. They will be reminded, too, that Britain has been 
somewhat behind the United States. The British did not have a Standard Industrial 
Classification until 1948, a Census of Distribution until 1950. Even in lags there are 
similarities. Our official national income estimators have only recently told us much
about what goes on in their work. At the moment such explanations are promised in
Britain. 

The descriptive passages inevitably make dull reading unless one has an immediate 
practical interest. But the interpretative and analytical passages are written in the 
simplest language compatible with the subject. Felicitous expressions occur. “The 
mere fact that they [measures of comparative efficiency] would be useful should not 
tempt us to assume that they are possible.” “Not only has one to be a specialist in 
the subject, but one has also to be in the confidence of the authors” of balance of 
payment estimates. On the other hand, certain problems of index-number making 
are discussed repetitively in different contexts; an early general section on problems 
of aggregation might have served better. And the author might well have said more 
about the uses of statistics. 

Many of the problems the author discusses are discussed more fully in standard 
textbooks of population, shipping, etc. Many of the distinctions he describes become 
clear after a little poring over the basic sources. But the book ought to save beginners 
from some loss of time, some confusion, and occasional downright error. The reader 
is given good general advice: for example, it pays to examine the form on which the 
basic information is obtained. To anyone obliged to work at a distance from an ade- 
quate collection of British material the volume should be especially valuable. The 
author is thoroughly acquainted with his broad subject. Anyone who prospects the 
sources in vain for some item will be reassured to learn that Devons says it does not 
exist. 


Industrial Censuses in the United States. Technical Assistance Mission No. 77, Oswald
George, Chairman. Paris: Organisation for European Economic Cooperation, 1955. Pp. 143.
$1.50. Paper.
Owen C. Gretton, Bureau of the Census 


THE report of the Technical Assistance Mission No. 77 on Industrial Censuses in
the United States constitutes a significant contribution to the meager fund of
literature in that important field. In this carefully written and thorough document, 
the authors outline the details of form preparation, decisions as to form content, 
field operations, use of sampling, processing of the returns, publication standards, 
and the myriad other facets of industrial census-taking in this country. The Assistance 
Mission publication also effectively brings into the picture the role of the inter- 
censal Annual Survey of Manufactures and the current commodity reports program 
of the Bureau of the Census. The manner in which the industrial censuses are respon- 
sive to the statistical requirements of compilers of basic economic indicators, such 
as national income, physical production and productivity indexes, and input-output 
studies is perceptively described. 

Industrial Censuses in the United States does over-emphasize the impact of the
Employment Act of 1946 on the statistical program of the Federal Government.
The members of the Mission miscalculated the effect that Act would have on Con- 
gressional appropriations for statistics. It is true that, within the past year or two, 
the intensive utilization of statistical series by the Council of Economic Advisors
and the support of that body for improved and expanded data sources have given 
a significant impetus to Federal statistical programs. In all likelihood, however, the
growth in availability of statistical information will come at a much slower pace 
than was anticipated in the report of the Mission. 

The report of the group accurately forecast the extensive use of Univac in the 
Economic Censuses covering 1954 and the utilization of administrative records for
establishments without employees in the Census of Business. The 1954 Censuses of
Manufactures and Business were almost completely processed, including the editing 
phase, on two Univacs at the Census Bureau, together with supplemental Univac 
time acquired on machines at other locations. At some time in the near future, portions 
of the book, and particularly those dealing with processing, could be effectively 
revised to describe the experiences of the 1954 Censuses. In the Business Census, 
the high-speed printer has been combined with the Univac operation to produce 
printer’s copy of the final tables, doing away with the traditional typing and set-up 
operations. 

Very significant progress has been made in the use of shuttle forms in the manu- 
facturing area since the visit of the Mission to this country. The advantages of that 
form, which is referred to on page 22 of Industrial Censuses in the United States, are
being realized most dramatically in the current Annual Survey of Manufactures 
covering 1955 activity. The new Annual Survey report form calls for five years of 
information (1954-1958) on a single piece of paper and there is already substantial 
evidence that this technique will materially reduce respondent errors and processing 
costs. Data from the manufacturers’ 1954 Census of Manufactures reports (as 
edited and tabulated in the Census) were entered on the form by Census for most 
items before mailing the report on which the respondent was to enter 1955 informa- 
tion. 

In general, the document contains accurate and penetrating statements regarding 
the respective roles of trade associations and government agencies in the collection 
of current commodity data. It is in error, however, in referring to a tendency of the
Federal Government in expanding its statistical program to duplicate some of the 
existing trade association data series. Actually the steps taken by the government, 
including the Census Bureau which is the principal agency in this field, have been 
exactly in the opposite direction. Within the past five years, the Census has discon-
tinued the monthly collection of data on a number of products, such as softwood
plywood, cast iron boilers, and other types of heating and cooking equipment in the
light of the availability of trade association data in these fields. The Bureau does
conduct annual surveys in such areas in order to provide a benchmark for the some-
what less than complete monthly industry data and to afford an official set of statis-
tics for various purposes. It makes certain that the association’s current data are made
publicly available to all data users and, in fact, frequently republishes the statistics 
in its own “Facts for Industry” publications for related products. 

The statement on p. 62 that “There is no separation [in the Census of Manufac- 
tures], however, of materials used or value added in such building [major additions 
or alterations to plant by the company’s own employees]” is inaccurate. The manu- 
facturer is instructed to omit materials used in capitalized construction from his cost 
of materials and supplies figure and include such materials only in the capital expen- 
diture inquiry. The wages of force-account construction workers are included in the 
payroll figures but this does not affect value added which is derived by subtracting 
the cost of materials, fuels, and contract work from the value of products shipped. 
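
The derivation just described can be put in a short sketch. The function name, variable names, and dollar figures below are hypothetical and are not taken from the Census report forms; the sketch assumes only the subtraction stated above, and shows that force-account wages, being part of payroll, never enter it.

```python
def value_added(products_shipped, materials, fuels, contract_work):
    """Value added as described in the review: value of products shipped
    minus the cost of materials, fuels, and contract work."""
    return products_shipped - (materials + fuels + contract_work)

# Hypothetical establishment (all figures illustrative, in dollars).
# Materials used in capitalized construction are left out of `materials`
# and would be reported only under the capital expenditure inquiry.
shipped = 1_000_000
materials = 400_000           # excludes capitalized construction materials
fuels = 50_000
contract_work = 30_000
force_account_wages = 20_000  # part of payroll; not one of the costs subtracted

va = value_added(shipped, materials, fuels, contract_work)
print(va)  # 520000
```

Changing `force_account_wages` leaves `va` unchanged, which is the reviewer's point: payroll is not one of the items subtracted.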

In summary, Industrial Censuses in the United States was a book that needed to be 
written, not only for use in other countries but in the United States as well. It is an 
informative document and reading it leaves the technician in the field with a feeling 
that more on the subject would be quite welcome. A number of portions of the book 
dealing with particular aspects of industrial census-taking could well be expanded to 
individual papers or even chapters in a volume on the same subject to the advantage 
of students and data compilers in the general area.


Interindustry Economic Studies. A Comprehensive Bibliography on Interindustry Re-
search. Vera Riley and Robert L. Allen. Baltimore: Johns Hopkins Press, 1955. Pp. 280.
$2.50. Paper.

INTERINDUSTRY economic analysis, according to the foreword, has been “developed
and attained permanent status as a highly significant and valuable analytical tool
during the past twenty years. ... Only recently has the separate development of 
theoretical work, collection of data, and computational capabilities reached the stage 
where application to research and policy problems has been possible.” 

Part I gives a bit of history and discussion of interindustry economics and lists 
eleven standard references. Parts II-XI are a bibliography of the field by subject
matter: Part II—Theoretical Structure of an Economy; Part III—Mathematical
and Computational Techniques; Part IV—Classification Systems and Problems;
Part V—The United States National Structure; Part VI—Manpower Studies; Part
VII—Dynamic Analysis; Part VIII—Regional Analysis; Part IX—National Studies
other than U. S.; Part X—Applications and Uses; Part XI—Appraisal.

All of the literature, published and unpublished, foreign and domestic, has been 
surveyed according to the editors, one a staff bibliographer at the Operations Re- 
search Office at Johns Hopkins and the other an economic consultant. They found 
their material in universities, private research institutions, and government agencies 
throughout the world and attempted to include all references up to March 1955. 
This adds up to over 1200 items from more than 375 authors. Extensive cross-refer- 
encing will be particularly helpful to those not familiar with interindustry literature. 

D. D. F. 


Methods of National Income Estimation, Studies in Method, Series F, No. 8. New York: 
Department of Economic and Social Affairs, Statistical Office of the United Nations, 
January, 1955. Pp. 58. Paper. 50 cents. 


Edward C. Budd, Yale University


TWO previous publications of the Statistical Office of the United Nations, A System
of National Accounts and Supporting Tables, and Concepts and Definitions of
Capital Formation,1 present the agency’s proposals for a standard system of national
accounts, together with suggestions for the uniform definitions of various concepts 
entering into the accounting framework. The present pamphlet discusses problems in 
the estimation of national income statistics within the framework of concepts pre- 
sented in the earlier reports. 

Proposals for the employment of similar source materials and the establishment of 
uniform estimating methods by the various nations would obviously be inappro- 
priate, and the report relies instead on a comparative description of the sources and 
methods now being used, although selecting and emphasizing those that appear to be 
most appropriate or of most general interest to the countries concerned. The first 
chapter is concerned with estimates from the income side by distributive shares, or 
“incomes accruing to the factors of production”; estimates from the product or ex- 
penditure side are discussed in the second chapter. A final chapter is devoted to the 
estimation of the industrial origin of income (the income concept employed is one 
of gross—rather than net—domestic product at factor cost), most of the discussion 
centering around problems of estimating agricultural income (farming, forestry, 
hunting and fishing), as representative presumably of problems to be found in other 
industries as well. 





1 Studies in Method, Series F, Nos. 2 and 3, 1953.

While more than forty countries are mentioned and particular estimating proce- 
dures of over thirty of these are referred to briefly, the estimates of a relatively small 
number of countries serve as the basis for most of the illustrative material. As might 
be expected, United States and Canadian methods receive almost as much attention 
as all the others combined. 

Explanations of sources and methods of income estimation are of use to two groups: 
users of the data who are anxious to form some opinion as to the reliability of the 
statistics, and national income estimators, who may profit from the experience of 
other estimators. The former group may well be discouraged by the impression one 
receives, from studies of this character, of the tenuous basis of many of the estimates. 
The latter group will undoubtedly find the discussion of various methods too brief 
for their purposes, although interesting leads may be provided. Both groups will find 
it necessary to refer to the basic sources (such as the United States National Income 
Supplement), a list of which is provided at the end of the report. Nevertheless, the 
study provides a useful introduction to this topic, and serves to emphasize the im- 
portance of developing adequate estimating procedures and improving underlying 
statistical sources by countries preparing national income estimates. Differences in 
sources and methods are probably a far more important factor producing a lack of 
comparability among the income estimates of different nations than are differences in 
concepts and definitions. Indeed, the choice among concepts is more often than not 
dependent on questions of “feasibility” and “practicality,” mere synonyms for the 
problems of sources and methods. 


A Study of Saving in the United States. Raymond W. Goldsmith. Vol. I, Introduction: 
Tables of Annual Estimates of Saving, 1897-1949, pp. xxx, 1138; Vol. II, Nature and 
Derivation of Annual Estimates of Saving, 1897-1949, pp. xxiv, 632; $30 for two volumes. 
Vol. III, Special Studies. Raymond W. Goldsmith, Dorothy S. Brady, and Horst Menders- 
hausen, pp. xix, 476, $8.50. Princeton: Princeton University Press, Vols. I and II, 1955; 
Vol. III, 1956. 


George Garvy, Federal Reserve Bank of New York


THE saving process is at the very heart of economic growth and change, yet the
Study of Saving is the first attempt to measure, and to a certain extent, to analyze,
the savings process as a whole. In contrast to several recent studies which focus on 
personal saving (such as Individuals’ Saving by I. Friend and V. Natrella), the Gold- 
smith study encompasses business and government saving as well. In undertaking 
the project, which was sponsored by the Life Insurance Association of America, 
Goldsmith was assisted by a committee comprising several distinguished academic 
and business economists. 

Any contribution to economic knowledge which is presented as a dartboard for 
students to practice their critical skills (author’s preface, p. xi) and to improve the 
game should be a challenge. But when the dartboard consists of nearly 1,000 tables 
cemented by more than 1,200 pages of text, the task of the reviewer who is allotted a 
few pages becomes almost impossible; I shall, therefore, concentrate on the statistical 
aspects of the Study of Saving, in particular as a source of new statistical time series. 

The underlying saving concept is that of the change in earned net worth. The 
annual estimates of the components of saving for the years 1897 to 1949 in Volume I 
are the statistical core of the study. Footnotes appended to each table make it possible 
for the user to trace each estimate to the primary sources (and to explain the adjust-
ments made) without reference to the text. The theory and praxis of measuring sav-
ing within a system of social accounting are discussed in Volume II. The discussion
of concepts deals with such perennial problems as whether capital gains are saving as
well as with questions which so far have received relatively little attention in litera- 
ture, such as the treatment of the costs of distribution of saving. The exposition of 
the essential features of the estimating procedures is followed by a discussion of the 
accuracy of the annual estimates. Estimates for each segment are compared with 
relevant alternative estimates, even when conceptual differences are quite large and 
when such estimates cover a shorter period. The rest of Volume II is devoted to the 
description of derivation of each of the components of nonfarm individuals’ saving 
and of the other major saver groups. The derivation of the extensions to 1897 of the 
Department of Commerce estimates of national product and of national and personal 
income, used in the analytical parts of the study to relate saving to significant eco- 
nomic magnitudes, is described in Volume III. Another part of the same volume pre- 
sents the national balance sheets and wealth statements which provide the analytical 
framework and statistical benchmarks for the time series presented in Volume I, and a 
balance sheet for households constructed on the basis of data from the 1950 Survey of 
Consumer Finances. 

The author stresses more than once that the analytical contribution of the Study of 
Saving (contained in Volume I) falls far short of the potentialities of the statistical 
material developed and of the original plans as well. The amount of resources that 
had to be devoted to the estimation and reconciliation of the basic series exceeded 
expectations, with the result that the original plans for analysis had to be curtailed 
drastically. A short summary of the findings, as well as a succinct discussion of the 
problems of measuring saving, may be found in Chap. I of the “Introduction”—the 
modest term used by Goldsmith for his analytical contribution. In the subsequent 
chapters, trends and cyclical fluctuations in saving are studied. Trends are fitted to 
various alternative definitions of national and personal savings for five periods (in 
some cases, using several alternative mathematical formulas), a variety of saving 
ratios is analyzed and changes in the shares of the various groups of savers in national 
saving and in the distribution of personal saving among forms of saving are studied 
on the basis of period averages. A final chapter is devoted to the relationship of the 
saving series to national wealth and to the national balance sheet estimates presented 
in the author’s earlier publications.1 Econometric experimentations with the savings
function (in Volume III) complement the analytical part. The bibliography deserves 
special mention: at the end of each volume, all sources quoted are alphabetically 
listed under two headings, non-U. S. Government and U. S. Government publications,
and the pages and tables in which the particular source is mentioned are given. 

This brief summary does not do justice to the important contributions of the two 
co-authors of Volume III, who are responsible for important monographs on two re- 
lated, but independent, bodies of saving data: Dorothy S. Brady’s study of family 
saving based on budget data reaching back to 1888-90 and probing into the deter- 
minants of saving behavior, and Horst Mendershausen’s substantial and, in many 
respects, pioneering study of the patterns of estate wealth. 

The main objective of the Study of Saving was to place in the hands of the research 
analyst a consistent, interlocking, and detailed set of annual saving estimates. The 
problems of arriving at a conceptually consistent and analytically meaningful de- 
termination of the various components of saving are explored more thoroughly than 
the determinants of saving behavior or the effects of variations in savings on the
level and rate of growth of economic activity.





1 In Studies in Income and Wealth, Volumes XII and XIV and in Income and Wealth of the United States (Income 
and Wealth, Series II).

Goldsmith has done as much as was feasible in order not to prejudice the choice 
of the user who looks to the Study of Saving primarily as a source of time series. The 
numerous variants offered, together with the detailed descriptions of the composition 
of aggregates make it possible to derive series fitting almost any conceivable analyt- 
ical need. Indeed, the “principle of reproductibility” has been rigorously adhered to 
throughout the study “by showing all the basic data from which the estimates were 
built up, or by indicating where they may be found, and by describing all manipula- 
tions of the data that were required to fit them into the structure used in the study.” 
Goldsmith is more concerned with clarifying the logical and statistical implications 
of the alternatives among which an analyst must decide in choosing an appropriate 
definition of saving than with the discussion of the respective merits of such decisions. 

National savings aggregates are developed for three accounting concepts: the
social accounting concept, in which depreciation is valued at replacement cost and
all capital gains and losses (including inventory profits and losses) are excluded; the 
business accounting concept, in which depreciation is based on original cost, and 
capital gains and losses as well as depletion allowances are included; and a cash 
concept, which ignores capital consumption and other accruals. The analytical part 
of the Study of Saving is based on the first concept while the last concept will be of 
interest primarily to those interested in the money flows analysis. The business 
accounting concept corresponds closely to prevailing business practices and is, there- 
fore, significant for motivational analysis. For each concept, alternative estimates 
excluding consumer durables are also given. An additional, broader variant, which 
includes military durables, soil improvement, and some minor adjustments is pro- 
vided for the social accounting concept. Estimates based on the social accounting 
concept are also presented in constant (1929) prices. Three alternative deflators are 
used, and data deflated by the general price level are given also, divided by the num- 
ber of inhabitants, households, consuming units, and persons in the labor force. 
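
The constant-price and per-head presentation described above amounts to two divisions, sketched here under stated assumptions. The function name, the index base of 100, and all input figures are hypothetical; the sketch does not reproduce any of the three deflators used in the study.

```python
def real_per_capita(nominal, price_index, population, base_index=100.0):
    """Deflate a current-price figure to base-year prices, then divide by
    a population count (inhabitants, households, labor force, etc.)."""
    constant_price = nominal * base_index / price_index
    return constant_price / population

# Hypothetical inputs (not from the study): saving of 12 billion dollars,
# price index of 150 on a 1929 = 100 base, 120 million inhabitants.
result = real_per_capita(12e9, 150.0, 120e6)
print(round(result, 2))  # 66.67
```

Substituting households, consuming units, or persons in the labor force for `population` gives the other per-unit variants the paragraph mentions.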

The annual savings data are usually derived from balance sheet estimates; within 
a system of social accounts, identical estimates are obtainable from the income 
account. For components for which balance sheets could be constructed for selected 
years only, annual data are derived by assuming regular increments between these 
benchmarks. Where balance sheet data are not available at all, the income, and, in 
some cases, the commodity flow approach, is used. Thus, Shaw’s (and, for more 
recent years, Department of Commerce) flow data are used to estimate investment 
in consumer durables (because of the lack of appropriate balance sheets for con- 
sumers) and to apportion capital formation between the corporate, unincorporated, 
and nonprofit sectors. 
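
The phrase "regular increments between these benchmarks" can be read as straight-line interpolation of balance-sheet levels, with annual saving taken as the year-to-year change. The sketch below makes that assumption explicit; it is an illustration only, not Goldsmith's procedure, and the function name and benchmark figures are hypothetical.

```python
def interpolate_benchmarks(benchmarks):
    """Fill in levels between benchmark years by assuming equal annual
    increments, then return the implied annual changes (saving)."""
    years = sorted(benchmarks)
    levels = {}
    for start, end in zip(years, years[1:]):
        step = (benchmarks[end] - benchmarks[start]) / (end - start)
        for i in range(end - start):
            levels[start + i] = benchmarks[start] + step * i
    levels[years[-1]] = benchmarks[years[-1]]
    # Saving in year y is the change in level from year y - 1 to year y.
    saving = {y: levels[y] - levels[y - 1] for y in sorted(levels) if y - 1 in levels}
    return levels, saving

# Hypothetical balance-sheet benchmarks (year -> net worth), not from the study.
levels, saving = interpolate_benchmarks({1900: 100.0, 1905: 150.0, 1910: 170.0})
print(levels[1903])  # 130.0
print(saving[1906])  # 4.0
```

Under this assumption every year between two benchmarks shows the same saving, so the interpolated series carries no information on year-to-year fluctuations within the interval.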

Goldsmith devotes a whole chapter to the question of the accuracy of his annual
estimates. One pertinent summary paragraph reads as follows: 

Evaluation of the possible errors in the individual series from which the estimates of 
group and national saving have been constructed indicates that the margin of error 
is hardly under 10 per cent for any given year or for the average annual figure in any 
series, that it is probably in the order of magnitude of 20 to 30 per cent in many of 
them, that it may run even higher in not a few cases, but that the relative margin 
of error in most cases is reduced for sequences of several years and generally the 
smaller the longer the period. This is due partly to the existence of periodic bench-
mark data which prevent errors from accumulating for more than the years between
benchmarks, generally five or ten years; and partly to erratic fluctuations, many of
which disappear when groups of several years are combined. (I: 40)


Annual data are supplemented by a set of averages for eight economic periods 
(generally including one or more full business cycles), for quadrennial periods, and
for the “normal period” defined as the entire span covered by the study with the ex-
clusion of the two world wars and the depression years 1930-33. Another set of tables
shows the distribution of saving of the major saver groups by components. 

Seven main saver groups are distinguished, and estimates for seven analytically 
significant combinations of these groups are provided. Thus, saving of nonagricultural 
households becomes successively a component of individuals’ saving (when com- 
bined with agricultural saving), personal saving (when saving of unincorporated 
business is added), private, nonfederal, and finally national saving. 

The choices made in the sectoring of saver groups and in the detail provided on 
the forms of saving were determined not by logic alone, but also by the nature of 
source data. A great effort was made to make such choices consistent with the main 
analytical requirements. Yet, for many types of motivational analysis of household 
saving, for instance, refinements of the series on individuals’ saving will be required, 
since in its present form it includes saving of nonprofit institutions as well, and no 
distinction is made between saving from income received and from accrued income 
(such as earnings of pension reserves and undistributed income of personal trust 
funds). 

When the author’s description of his sources and methods reads like a nonending 
plea for better data, it becomes clear that his primary goal was to erect a logical 
system of empty statistical boxes and to fill it with the best available estimates, 
including estimates of his own where a complete statistical void was encountered. 
Goldsmith thoroughly discusses the limitations attached to his sources and to his 
own methods of filling in gaps and breaking up aggregates into components. As a 
matter of fact, much of Volume II reads like a running commentary on the short- 
comings of economic and financial statistics now available. Probably no author of 
a major contribution to our statistical knowledge and economic understanding was 
more aware of the limitations of his sources and the inadequacy of his own efforts, 
both of which are traceable to the dearth of primary data which becomes nearly 
absolute when data are extended back to 1897. 

Following the principle of the equalization of the marginal efficiency of the unit 
of effort, Goldsmith attempted to distribute the total resources of the study roughly 
in proportion to the size of components of saving. Where usable secondary sources 
were available, Goldsmith didn’t hesitate to draw upon them. Yet, in the author’s 
own words “only rarely does the Study of Saving use a statistical series as it stands as 
a final component in the measurement of saving or even as a major constituent of 
such a component. Usually, series of basic data have to be combined, split, or blown 
up or otherwise adjusted to fit into the system of social accounting which forms the 
framework of this study’s saving estimates.” (II: 231) 

The adaptation of the source material to the author’s social accounting framework 
was a formidable task. Even more ingenuity had to be displayed in providing original
estimates, which, in some cases, are based on scraps of collateral evidence, on hunches 
or even on guesses (such as estimation of expenditures of nonprofit institutions on 
equipment as equal to 20 per cent of their construction expenditures). Indeed, Gold- 
smith refers to some of his own estimates as nothing more than “reconnaissance into 
an almost unexplored field” (II: 565) or an attempt “to cut a precarious path through 
the jungle.” (II: 357) Some readers of the Study of Saving will feel that less would be 
more. But in most cases less would not do, since the tenuous estimates fill in gaps, or 
break down aggregates, in order to build bridges without which the system of accounts 
would not balance out. 

Goldsmith’s original estimates range from such major contributions as over-all 
estimates of saving for Federal as well as state and local governments, of separate 
estimates for saving of agriculture and unincorporated business enterprises and non-
farm saving through consumer durable goods, to the derivation of minor component
series. Some of the most important new estimates developed in connection with the
Study of Saving are extensions back to 1897 of the annual series on corporate cash, 
profits, and saving, on business inventories, new issues of corporate stock absorbing 
cash, and on residential mortgage debt. Other important contributions include en-
largements of the series on individuals’ saving through securities and real estate by
including brokers’ commissions and builders’ profits, respectively. It is interesting to 
speculate with what lag those improvements will be incorporated into the official 
estimates of personal and individuals’ saving. 

The statistical material of the Study of Saving must be evaluated in terms of its 
basic objective—to provide a sufficiently solid basis for studying trends and struc- 
tural changes. Perhaps Goldsmith should have emphasized that annual data for 
components are provided not as an incentive for studying year-to-year fluctuations, 
but to permit derivation of averages for periods different from those on which his own 
analysis is based. Indeed, the limitations of annual data derived by techniques in- 
volving generous use of constant blow-up factors, mark-ups, and ratios (mostly, 
but not exclusively, in the minor component series) are obvious, and the author per- 
sistently stresses this point. The reliability of the estimates decreases as they are 
pushed back of the 1929-33 period, where many of the basic data used by Goldsmith 
begin, and with the degree of disaggregation. In order to provide the same amount 
of detail for the entire period, the limits of statistical feasibility (generously inter- 
preted) were approached more than once. The development of a historical record of 
social accounts is a long process, and the reconstruction of even the not-too-distant
past will hardly ever progress beyond rough approximation. Definite possibilities 
for the improvement of detailed estimates exist, however, in many areas, and they 
are specifically discussed by Goldsmith with respect to basic data already available 
as well as for areas which call for additional primary data. (II: 120ff.)

In the main, Goldsmith’s findings on the behavior of saving and of its relationship 
to income and other relevant economic magnitudes over the longer period² confirm
the conclusions familiar from the analysis of data covering shorter periods and deviat- 
ing from the social accounting concept. They confirm the cyclical variability of sav- 
ing and the relative stability of the personal saving ratio (if consumer durables are in- 
cluded). Except for the war years and the Great Depression, personal saving ac- 
counted for about three-fourths of the national saving. While national saving increased 
through the entire period covered by the study at an annual rate of approximately 
7 per cent, this rate is reduced to about 12 per cent when adjusted for the increase in 
price level and population growth. Personal saving averaged 13 per cent of personal 
income, but national saving was equal to only 9 per cent of the national income, 
mainly because of Federal government dissaving during the two world wars and 
the Great Depression. 

Secular shifts among saver groups were less important than changes among the 
forms of personal saving. The gradual relative decline of investment in real estate and 
agriculture and of holdings of bank deposits was accompanied by a rise in saving 
through life insurance and pension and retirement funds, both government and 
private, and through durables. Saving through corporate stocks and bonds lost in 
favor of government securities. Saving invested through intermediaries has increased 
at the expense of the saving lent directly to borrowers, but saving for self use was in 
1946-49 about the same as in 1897-1914. This process of the institutionalization of





2 Some of the main findings were discussed by Goldsmith in “Trends and Structural Changes in Saving in the
Twentieth Century,” in Savings in the Modern Economy, University of Minnesota Press, 1953.





saving will be the subject of another study by Goldsmith (Financial Intermediaries in 
the Saving and Investment Process in the American Economy) now in press. 

The textual part of the Study of Saving is a masterful combination of lucid exposi- 
tion, forceful argumentation, careful interpretation of results, and uncluttered atten- 
tion to detail. One of the expository difficulties of the Study is that in spite of the focus 
on the saving process as a whole, the discussion is frequently narrowed to household 
saving, as shown, for instance, by the following statement from the Introduction: 


To understand the saving process adequately we need to know, for the present as 
well as for a relevant stretch of the past, (a) which economic units save or dissave 
at different times, (b) how much they save or dissave, (c) in what forms their saving 
or dissaving occurs, (d) how saving is related to characteristics of the saving unit, 
such as its income, assets, age, and occupation, and to factors affecting the entire 
economy, such as movements in prices and interest rates, and (e) why different economic
units save or dissave to the extent and in the forms they do. (I: 25)


The factors enumerated under (d) are relevant primarily for saving by individuals, 
and additional or different determinants must be considered when analyzing corporate 
or government saving. 

There is a certain imbalance in the descriptive parts of the text, since the principle 
of apportioning the time available for the improvement of the basic time series 
roughly in proportion to their relative importance in aggregates has not been carried 
through in the textual description. Indeed, some parts of Volume II are research 
memoranda prepared in the process of the study, and no attempt was made to rewrite 
them to achieve uniformity in presentation and balance in space allocation. Yet, the 
lively tone of the discussion and the banning of the technical details to the footnotes 
to individual tables make even the purely descriptive parts eminently readable. 

Seen as an edifice in process of erection and to be improved upon by subsequent 
efforts (which are already under way, including extension of all basic series to more 
recent years), rather than as the presentation of a set of definitive estimates, the 
Study of Saving represents a major statistical contribution. The broad conclusions 
drawn from the analysis of the principal savings series—and Goldsmith did not push 
his analysis to greater detail—are unlikely to be modified by any subsequent revi- 
sions, refinements, and improvements of the underlying time series. 

No doubt, for many years to come, the basic statistical material of the Study of 
Saving will serve as raw material for numerous analytical studies. Indeed, the full 
significance of the Study will only be revealed by the use which subsequent analysts 
will make of its rich source material. Its statistical shortcomings are known to nobody 
better than to its author, who is fully aware of the fact that no amount of cross- 
checking can fill the gaps in basic sources. It is only to be hoped that his warnings and 
qualifications will be heeded by users. 


Flow of Funds in the United States, 1939-1953. Board of Governors of the Federal Reserve 
System. Washington D. C.: December 1955. Pp. 390. $2.75. Paper. 


Emmett J. Rice, Cornell University


This is an important report. It presents and describes a new social accounting
system applied to the American economy for the fifteen year period, 1939 to 1953. 
Although the results of the report will be of primary interest to economists, probably 
at first most particularly to students of money and banking, finance, and business 
fluctuations, the methodology employed (the concepts and statistical procedure) will 
certainly command the attention of social accountants and statisticians. 





The new system did not emerge full-blown from this investigation but owes its 
origin, and to a very great extent its development, to the path-breaking study con- 
ducted by Morris A. Copeland under the sponsorship of the National Bureau of 
Economic Research and with the cooperation of the Board of Governors of the Federal
Reserve System.¹ The research staff of the Board of Governors has carried on
the development of flow-of-funds accounting to fruition in the present report. 

The principal objective of flow-of-funds accounting is “to provide a comprehensive 
and systematic economic record that will facilitate study of the interrelations among 
financial and nonfinancial processes.” The aim is thus to furnish a statistical frame- 
work for relating what is known about the production, consumption, and transfer 
of goods and services to corporate “financial” sources and uses of funds, changes in 
cash balances, government debt, consumer and farm credit, and bank and insurance 
company portfolios. Such a record enables us to trace fund flows through the entire 
(national) economy. 

The structure of the system’s accounts, their scope and integration, are condi- 
tioned by this and other analytic objectives. As fund flows are indicated in the 
accounts of various transactors (households, business firms, and governments), the
economy may be divided into sectors consisting of economic units (transactors) 
grouped according to their dominant economic characteristics. Ten transactor groups 
(sectors), “similar with respect to function and institutional structure,” were found 
to be most useful: consumers, corporate business, nonfarm noncorporate business,
farm business, federal government, state and local governments, banking system, 
insurance, other institutional investors, and the rest of the world. Conceptually, 
each of these sectors may be further subdivided into as many component groups as 
desired or practicable. (Indeed, the three financial sectors are built up by consolidat- 
ing accounts of significant subsectors.) 

Entries in the accounts of transactors include all transactions between separate 
economic units which are effected through exchanges of money and credit. This 
means, of course, that intra-firm or intra-household, barter and imputed transactions 
are excluded, the system making no attempt to account for them. Transactions in- 
cluded in flow-of-funds accounts are classified into 12 nonfinancial categories such 
as payroll, dividends, insurance premiums (and benefits), taxes (and tax refunds), 
other goods and services, etc.; and financial categories such as currency and deposits, 
bank loans other than mortgages, federal obligations, corporate securities, trade 
credit, etc. This procedure focuses attention on the interrelations between financial 
and nonfinancial processes. The transaction categories, defined in such a way that 
for all transactors the sum of all payments equals the sum of all receipts in each class, 
are applied consistently to the accounts of the several sectors so that flows of funds 
among them are classified in a broadly comparable pattern. The accounts for the 
economy as a whole are summarized in ten financial statements which take the form 
of sources and uses of funds for each of the ten sectors. Summary statements of 
sources and uses of funds for all sectors have been prepared on a comparable basis 
for each year from 1939 through 1953. (A summary statement is now available for 
1954 in the Federal Reserve Bulletin, October 1955.) The statements present a picture 
of the interrelations of all transactor groups in the economy for the year. Broadly 
they indicate what groups of transactors have paid, and what other groups have re- 
ceived, and how much, in connection with various types of transactions, e.g., payroll, 





1 Morris A. Copeland, A Study of Moneyflows in the United States. New York: National Bureau of Economic
Research, Inc., 1952.





interest, rents and royalties, etc. The statements also show what groups of transactors 
owed and what others owned how much in the form of deposits and currency, notes, 
bonds, mortgages, trade credit, etc. Since the sources of funds for each transactor 
represent a use of funds by some other transactor the sector statements for the 
economy as a whole interlock and balance. Indeed, the sector statements may be 
thought of as “balance of payments” statements similar to those prepared to sum- 
marize international transactions. 
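
The interlocking of the sector statements can be illustrated with a minimal sketch in Python. The transactions, amounts, and function names below are hypothetical and are not taken from the Flow of Funds report; treating a deposit as a payment from consumers to the banking system is a simplification made only for illustration. Because each transaction is entered once as a use of funds by the payer and once as a source of funds by the receiver, total sources equal total uses for the economy as a whole, and the sector net positions sum to zero.

```python
from collections import defaultdict

# Hypothetical transactions: (payer sector, receiver sector, category, amount).
# None of these figures come from the report; they only illustrate the bookkeeping.
TRANSACTIONS = [
    ("corporate business", "consumers", "payroll", 120.0),
    ("consumers", "corporate business", "other goods and services", 95.0),
    ("consumers", "federal government", "taxes", 15.0),
    ("federal government", "consumers", "payroll", 10.0),
    ("consumers", "banking system", "currency and deposits", 20.0),
    ("banking system", "corporate business", "bank loans", 20.0),
]


def sector_statements(transactions):
    """Return (sources, uses) of funds by sector.

    Each transaction is recorded twice: as a use of funds by the payer
    and as a source of funds by the receiver.
    """
    sources = defaultdict(float)
    uses = defaultdict(float)
    for payer, receiver, _category, amount in transactions:
        uses[payer] += amount
        sources[receiver] += amount
    return dict(sources), dict(uses)


def net_positions(sources, uses):
    """Sources minus uses for every sector that appears on either side."""
    sectors = set(sources) | set(uses)
    return {s: sources.get(s, 0.0) - uses.get(s, 0.0) for s in sectors}


if __name__ == "__main__":
    sources, uses = sector_statements(TRANSACTIONS)
    for sector, net in sorted(net_positions(sources, uses).items()):
        print(f"{sector:20s} sources={sources.get(sector, 0.0):7.1f} "
              f"uses={uses.get(sector, 0.0):7.1f} net={net:6.1f}")
    # Economy-wide check: every source is some other sector's use.
    print("balanced:", abs(sum(sources.values()) - sum(uses.values())) < 1e-9)
```

In this toy record a sector with a negative net position has paid out more than it received; in a complete set of accounts that gap would be matched by a change in the sector's cash balance or other financial claims.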

Flow-of-funds accounting differs from GNP accounting in ways too numerous and 
involved to detail here. However, it is important to note that the perspectives of the 
two systems are different. In the first place, GNP accounts measure flows of currently 
produced goods and services, while flow-of-funds accounts cover all transactions, in- 
cluding used real assets, securities, and transfer payments as well as GNP transac- 
tions involving the use of money and credit. Thus one feature of the flow-of-funds 
system is its broader transactions coverage. 

Perhaps something of the same notion is conveyed by saying that GNP focuses 
upon the income circuit and that the flow-of-funds approach focuses upon the money 
circuit. Although GNP accounts include various imputed items which are excluded 
from flow-of-funds accounts, GNP accounts do not reflect the roles played in the 
economy by money and credit, the banking system and other financial institutions, 
consumer and business assets and liabilities. The flow-of-funds system, therefore, 
presents the nation’s economic accounts in a financial framework, rather than one 
which depicts economic activity in terms of current output alone. 

The different perspectives of the two systems lead to different methods of sectoring 
the economy and this fact, in turn, militates against an easy translation of the most 
closely corresponding accounts and figures of one system into the other. Happily, 
the Flow of Funds report contains a number of “relationship tables” which indicate 
the steps necessary to relate certain GNP items, concepts, and series to those of 
flow-of-funds which are somewhat similar. For example, the personal income figure 
in the national income accounts may be adjusted by the addition and subtraction of 
a number of items to move to the figure for consumer nonfinancial sources of funds 
in flow-of-funds accounts. A similar procedure may translate personal consumption 
expenditures in the national income accounts into consumer nonfinancial uses of 
funds in flow-of-funds accounts. Unfortunately, the process of reconciliation is ardu- 
ous and possibly hazardous for those who are unfamiliar with the different bases on 
which the various series are compiled. In this connection, however, it is understood 
that work is proceeding on a technique which will effect a fuller reconciliation of 
related items in the two systems. 

The flow-of-funds record can be useful in a variety of ways. It may “contribute 
helpful perspective on current and prospective economic developments by providing 
a framework for integrating measures of income, consumption and capital expendi- 
tures, and borrowing and lending.” Moreover, such a vast collection of data systemat- 
ically organized in a comprehensive set of social accounts will surely provide the 
basis for formal inquiries into the nature of the interactions between financial and 
nonfinancial factors during economic changes. Because flow-of-funds accounts 
identify, to some extent, the economic groups which are borrowing and lending, they 
may be used along with other data to trace and evaluate changes in the structure of 
private and public debt. Hence, potentialities for use in dealing with debt manage- 
ment problems are apparent. 

Since flow-of-funds accounts and sector statements show how changes in financial 
flows are related to changes in each sector’s balances of cash, trade credit, securities
and other claims held and outstanding, new light may be thrown on the processes
of cyclical expansion and contraction. For example, it may be possible to identify 
the transactor group (or groups) whose decisions, as reflected in their accounts, 
initiated the expansion or contraction. And if available for a long enough period, 
the record may be used “in drawing some tentative inferences with respect to cyclical 
and trend relationships among spending, saving, and financing.” Moreover, the rec- 
ord provides a factual basis and an analytical framework for evaluating the effects 
of monetary policy. When the accounts are presented on a more current (quarterly) 
basis it is probable that they will be helpful in making monetary policy decisions. 

Interest in the concepts and uses of flow-of-funds accounts is spreading rapidly. 
In November 1955, the Canadian Dominion Bureau of Statistics disclosed that several 
government agencies were collaborating in an exploratory research project to relate 
money flows to the “National Accounts.” A study of the post-war capital markets now 
in process under the sponsorship of the National Bureau of Economic Research will 
use its own specially developed flow-of-funds statements as basic data, and a doctoral 
thesis nearing completion at Cornell University will push back to 1932 the flow-of- 
funds estimates for the nonfinancial corporate sector. 

Although their nature cannot be predicted precisely, flow-of-funds accounting will 
eventually have repercussions on the teaching of money and banking. Undoubtedly 
it will be used along with other devices for showing the relationships between changes 
in money and credit and the functioning of the economy as a whole. 

The Flow of Funds report not only contains a wealth of statistical material organ- 
ized in a manner heretofore unavailable; the new accounting system which it develops 
and explains also represents a major step forward both in the presentation of economic 
statistics and as a tool for financial analysis. 


Electronics in Business. A Case Study in Planning: Port of New York Authority, Series 
III, No. 3. Herbert F. Klingman. New York: Controllership Foundation, Inc., 1956. Pp. 
xviii, 122. $4.00. Paper. 


Supplement No. 1, Electronics in Business, A Descriptive Reference Guide, Series III, 
No. 4. Florence A. May, Editor. New York: Controllership Foundation, Inc., 1956. Pp.
viii, 130. $3.00. Paper.


Fred W. Braga, Detroit Edison Company


This is a case study in the administration, planning, and research necessary for
establishing an electronic data processing program. The study has broad application
as a guide to managers responsible for appraising the practical and economic
potentialities of electronic data processing in their respective businesses. 

The book relates frankly and objectively the experiences of the Port of New York 
Authority. Disappointments, mistakes, and reverses are set forth as candidly as 
achievements and successes. The fundamental principles delineated and the problems
and pitfalls enumerated are the same principles and problems which any businessman 
will meet in similar electronic data research and planning projects. This account of 
pioneering research in the application of electronic data processing will help to 
smooth the trail for those who follow. A very real contribution has been made by 
Port of New York officials and the writer of this book. 

A by-product of this report is a case study in fundamentals of industrial procure- 
ment policies and practice, with special reference to capital equipment and electronic 
personnel. 

This comprehensive and able presentation consists of seven sections. The introductory
section summarizes the history, organization, and operation of the Port of New
York Authority. Section II depicts the early (1949) phases of electronic data research
for the Port Authority. Section III reviews past results and outlines a plan for future 
development. Section IV reviews the organization and personnel staffing of an elec- 
tronic programming bureau. The remaining sections describe specific phases of the 
Port’s research, planning and development. 

Supplement No. 1 supplements the earlier reference guide issued July 1955, bringing 
pertinent information up to date through December 31, 1955. It sets forth the char- 
acteristics of available electronic data processing machines, and also contains a rather 
complete reference to articles that have been published in the electronic data process- 
ing field. It is an excellent guide for locating source material. 


Urban Mortgage Lending: Comparative Markets and Experience. J. E. Morton. Prince- 
ton: Princeton University Press for the National Bureau of Economic Research, 1956. 
Pp. xx, 187. $4.00. 


Robert O. Harvey, University of Illinois


This study is apparently the final report on the National Bureau of Economic
Research project on urban real estate finance. The National Bureau studies on
urban mortgage financing include the following Bureau publications: R. J. Saulnier, 
Urban Mortgage Lending by Life Insurance Companies, 1950; Carl F. Behrens, 
Commercial Bank Activities in Urban Mortgage Financing, 1952; C. Lowell Harriss, 
History and Policies of the Home Owners’ Loan Corporation, 1951; Ernest M. Fisher, 
Urban Real Estate Markets: Characteristics and Financing, 1951. Morton has summa- 
rized, contrasted, and compared the findings of each of these books and has also drawn 
from an unpublished Bureau report on savings and loan association mortgage lending 
by Edward E. Edwards and from John Lintner’s book on Massachusetts mutual savings
banks.¹ The Bureau studies have covered the period from 1920 to 1947; Morton
adds to the original material by including periodically reported statistical data on 
mortgage lending through 1952 and data from the Bureau of the Census 1950 Survey 
of Residential Financing. 

The foreword and introduction are supplied by R. J. Saulnier, Director of the 
Financial Research Program, NBER. Saulnier’s introduction is a good summary of 
Morton’s observations and conclusions and tends to consolidate the contributions 
from the urban real estate finance project into a recognizable point of view. 

The first three chapters are a description of the supply of mortgage funds. The 
nonfarm mortgage debt, the structure of the urban mortgage markets, and the struc- 
ture of the lending industry are described. The fourth chapter indicates charac- 
teristics of outstanding mortgages while the fifth reports on lending experience. 

Morton has been particularly careful to describe the problems and limitations 
of the samples of data from the various types of lenders and to caution about the 
deficiencies of the samples. Appendix A is a detailed review of the sample program 
and a description of biases. Appendix B includes the questionnaires and instructions. 
The detailed outline of the sampling method and problems is worthwhile and will
be useful to anyone wishing to undertake similar research on institutional lending. 
Appendix C presents periodic data on mortgage finance, and Appendix D is a com- 
parison of the results of the National Bureau reports and the Bureau of the Census 
Survey of Residential Financing.

Morton has summarized the materials very well and gives a satisfactory description 
of the characteristics of mortgage markets and lenders. The book contributes a great 





1 Mutual Savings Banks in the Savings and Mortgage Markets. Boston: Harvard University Division of Research, 
1948. 








deal toward explaining the position of the residential mortgage in the nation’s 
credit structure. 

Observations on lending policies indicate that lenders have consistently exercised 
choices contrary to their interests whenever choices were available. For example, the 
best experiences of the 1930’s were on the types of loans which are now most fre- 
quently protected by federal loan insurance or guaranty. The loans which probably 
carry the greatest risks are the ones made predominantly on a conventional basis. 
Lender decisions on interest rates and risks have been poor according to the study. 
To quote from p. 12 of Saulnier’s introduction, “Our basis for testing the success or 
failure of mortgage lenders in making such adjustments is far from perfect, but the 
record does show that differences in contract interest rates as between groups of 
loans were as often as not the opposite of what would have been necessary to adjust 
for differences in eventual loss rates.” 

The principal determinant of mortgage loan experience according to the evidence 
presented appears to be the phase of the business cycle in which the loan is made. 
The closer a loan is made to a major downturn in consumer income and in real estate 
values, the greater the chance that it will end in default. Along with this conclusion 
there is the implication that there is little or nothing in the terms of a loan which 
permits it to be identified as either potentially good or bad.

The data made available through the surveys do not permit an identification of 
what constitutes a “good loan.” It should be recognized that inadequate information 
in lender files prevents a reliable analysis of the relationships between loan charac- 
teristics and experience. If anything at all is proved by the experience phase of the 
project it is that the traditional attitudes toward “safe” loan terms are not reliable. 

The author’s descriptions of loan terms and experiences and the observations in 
the introduction about loan experience tend to give a bias against high loan to value 
ratios and long terms. Saulnier’s comment on p. 11 that “. . . the experience on the
more liberally designed loans was less favorable than on those of a more conservative 
cast,” deserves some comment as to “why.” 

Morton’s sentence, on p. 102, “Thus, the data would imply that loans of large origi- 
nal amount, in particular those with high loan-to-value ratios, and on which repay- 
ment had been relatively small at the time of foreclosure, presented the greatest fore- 
closure risks” implies a tendency to default. In connection with this point, it is notice- 
able that Morton does not relate back to an observation made early in the book (p. 
16) that a family residence for credit purposes is most like a consumer good. Obvious- 
ly, then, the amount and stability of family incomes are important factors in loan 
experience. (Saulnier makes this point on p. 10, but it is not given consideration in 
the discussion on loan experience.) Since loans opposite in characteristics to those de- 
scribed in the above quotation are presumably less risky, it is appropriate to speculate 
on whether the terms themselves are responsible for the reduced tendency to default 
or if the probable lower ratio of loan servicing requirements to family income warded 
off default. If the latter conclusion has merit, it would no longer be appropriate to call 
a loan of relatively long contract maturity “liberal” unless that loan also had a pay- 
ment to income ratio in keeping with allegedly “good, conservative loans.” 
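
The argument about contract maturity and the payment-to-income ratio can be made concrete with the standard level-payment amortization formula. The sketch below is illustrative only: the loan amount, interest rate, maturities, and family income are hypothetical figures chosen for the example and do not come from Morton's data, and the function names are assumptions of this sketch.

```python
def level_payment(principal, annual_rate, years, periods_per_year=12):
    """Level periodic payment that fully amortizes a loan.

    Uses the standard annuity formula P * r / (1 - (1 + r) ** -n),
    where r is the rate per period and n the number of payments.
    """
    n = years * periods_per_year
    r = annual_rate / periods_per_year
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)


def payment_to_income(principal, annual_rate, years, monthly_income):
    """Ratio of the monthly loan payment to monthly family income."""
    return level_payment(principal, annual_rate, years) / monthly_income


if __name__ == "__main__":
    # Hypothetical loan and income, for illustration only.
    principal, rate, income = 10_000, 0.05, 400
    for years in (10, 25):
        payment = level_payment(principal, rate, years)
        ratio = payment_to_income(principal, rate, years, income)
        print(f"{years:2d}-year loan: payment {payment:7.2f}, "
              f"payment-to-income ratio {ratio:.3f}")
```

Under these assumed figures the longer maturity carries the smaller monthly payment and therefore the lower payment-to-income ratio, which is the sense in which a long-term loan need not be "liberal" from the standpoint of the borrower's capacity to pay.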

The results of foreclosures described in Chap. 5 establish that foreclosure is a losing 
proposition. Accordingly, anything which can be done to avoid foreclosure should be 
profitable for the lender. 

Both Morton and Saulnier were commendably reluctant to claim insight as to 
what would produce the best loans. Perhaps it is not too forward for an interested 
observer to derive a conclusion from the data: It is helpful in the event of foreclosure
to have loans with low ratios to value on properties whose distress prices surely will
not fall below the amount of the loan outstanding, but it is even more helpful to pre- 
vent defaults in the first place by making loans on whatever terms are necessary to 
establish payment requirements which surely will not exceed the borrower’s capacity 
to pay. 


Suburbanization of Service Industries Within Standard Metropolitan Areas. Raymond P. 
Cuzzort. Oxford, Ohio: Scripps Foundation for Research in Population Problems, Miami 
University, and Population Research and Training Center, University of Chicago, 1955. 
Pp. vi, 71. $1.05. Paper. 


David M. Blank, Columbia Broadcasting System, Inc.


This volume is the latest in a series of reports of the Scripps Foundation, dealing
with patterns of metropolitan growth. An earlier study analyzed the suburbaniza- 
tion of manufacturing activity within standard metropolitan areas. This study under- 
takes similar analysis for service industries, based on data from the 1939 and 1948 
Censuses of Business. 

The author first describes the spatial distribution of service industries, both within 
and among standard metropolitan areas. He then, in his major effort, attempts to 
determine (and measure the importance of) the major independent variables that 
affect (a) the proportion of service activity in suburban areas (or, more precisely, in 
those portions of standard metropolitan areas outside of central cities), and (b) the 
rate of change of service activities in the suburbs. 

In this latter phase of the work, the author works with data for the 125 SMA’s 
that had a population of at least 100,000 in 1940. The basic technique is cross-sec- 
tional and utilizes multiple regression and multiple covariance analysis. 

The early descriptive portions of the work are useful and informative. The results 
of the analytic sections, however, seem somewhat barren. In these latter sections, 
the author has conscientiously carried through both simple and multiple regression
analyses involving (a) four measures of the relative importance of service activity in 
the suburbs and nine independent variables, presumed to determine such relative 
importance, and (b) five measures of change in service activity in suburban areas 
and seven independent variables, presumed to determine such change. Nearly all 
the independent variables are demographic in nature. The major conclusions are 
found to be limited to the following: 

1. The per cent of total service receipts in the suburbs is highly correlated (at the
zero order level) with the per cent of retail sales in the suburbs.

2. When all the independent variables are included in a multiple regression analysis, 
the per cent of service activity in the suburbs is significantly associated with 
the per cent of suburban population, the per cent of population in the rural 
part of the suburbs and the median family income of the SMA. Of these, the 
dominant factor is suburban population, which alone accounts for about three- 
quarters of the variation in suburbanization of service activity. 

3. Less than half of the variation in the change in service activity in the suburbs is 
accounted for by any variable or group of variables. 

We are then left with these two (hardly striking) conclusions: the degree of sub- 
urbanization of service activities in Standard Metropolitan Areas is positively related 
to the degree of suburbanization of retail trade and population. 
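
What it means for a single variable to "account for" a share of the variation can be sketched with the squared zero-order correlation between two series. The figures below are invented percentages for six hypothetical areas, not data from the study under review, and the function name is an assumption of this sketch.

```python
def r_squared(x, y):
    """Squared zero-order (simple) correlation between x and y.

    In a simple regression of y on x this equals the share of the
    variation in y accounted for by x.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Invented figures: per cent of population and per cent of
    # service receipts located in the suburbs of six imaginary areas.
    suburban_population = [10, 20, 30, 40, 50, 60]
    suburban_receipts = [8, 22, 18, 30, 28, 45]
    share = r_squared(suburban_population, suburban_receipts)
    print(f"share of variation accounted for: {share:.2f}")
```

A value near one would mean the second series is almost a linear function of the first; it says nothing about why the two move together, which is the substance of the objection raised in this review.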

The limited contributions of the analytic portion of this study to our knowledge of 
the suburbanization movement (particularly in view of the arsenal of statistical 
measures used) raise several questions in the mind of this reviewer. One is whether
this kind of cross-sectional analysis is really capable of making a substantial contribution,
in light of its necessary dependence on those broad (and rather crude) demographic
and economic measures that are available for all areas to be studied. Even
more important is the question of whether such an analysis can be fruitfully under- 
taken without at least a speaking acquaintance with the knowledge that has already 
been accumulated in the fields of urban land economics and urban location. 

On this last point, here are two of a large number of examples. First, the author
found that the central city typically accounted for a larger proportion of the SMA’s 
total service receipts than it did of the SMA’s population. To explain this phenom- 
enon, the author, at various points, suggests (a) that the central city may have 
maintained its “dominance” over the suburbs, (b) that transients in the central city 
may account for part of this result, and (c) that there may be a time lag in the devel- 
opment of economic activity in the suburbs. The second and third reasons are prob- 
ably correct (the first is either tautological or meaningless). But the most important 
reason, and one well-known to researchers in the urban field, is that there is a sub- 
stantial difference between the “daytime” population and the “nighttime” (or 
residential) population of central city areas. Large numbers of people who live in 
suburban areas enter central cities every day to work, shop, and be entertained. Thus, 
any attempt to determine why there are as many barber shops in Manhattan as there 
are, that excluded from its frame of reference the several million non-residents that 
enter the central area of Manhattan daily, would be almost useless. It is true that the 
Population Census (from which the demographic data used in this study were ob- 
tained) does not provide such information, but this merely raises again the question 
of whether such crude measures of urban activity as are available for all metropolitan 
areas are sufficient to provide the basis for serious analytic work in this field. 

Second, the author throughout provides data and performs analyses on total serv- 
ice activity, as well as its four components: personal services, business services, 
automobile repair services and garages, and miscellaneous repair services. He dis- 
covers early in the work that business services are considerably less suburbanized than 
other service industries and, at several points, tentatively suggests that perhaps the 
“highly centralized character of business services is attributable to a tendency on 
the part of business services to be patronized by a population of business establish- 
ments rather than the general population.” 

It should have been clear to the author at the outset that there is no reason to ex- 
pect business services (or, indeed, these segments of automobile and miscellaneous 
repair services that are utilized by business firms) to have the same locational pattern 
or to be subject to the same locational determinants as consumer services. After all, 
almost half of the total receipts of business service establishments, the author in- 
dicates, are accounted for by advertising agencies! Even if this fact were not known 
in advance, the author recognizes at an early stage that the locational pattern of business
services differs markedly from that of the other service activities. Yet he per-
sists throughout in describing and analyzing total service activity, inclusive of 
business services. There can be no objection to an analysis of the relation of demo- 
graphic factors to business service location alone, although the results could be ex- 
pected to be (and indeed are) minimal. It is, of course, more logical to analyze per- 
sonal services and their relation to demographic factors, as the author does. But to 
perform the same analysis on total service activity, defined in such a manner that 
at least one-fifth of such activity has no direct linkage to consumers, simply results in 
continual blurring of whatever relationships do exist between consumer-oriented 
service establishments and such demographic factors as are used in the analysis. 





690 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1956 


Structure of Indian Industries. M. M. Mehta. Bombay: Popular Book Depot, 1955. Pp. 
xxvii; 340. Rupees 22/8. 


Neil H. Jacoby, University of California, Los Angeles


This book, to which V.K.R.V. Rao, Director of the Delhi School of Economics, has
written an illuminating foreword, is a scientific contribution of major importance
to students of the Indian economy. The product of six years of intensive work by 
Mehta in India and in the United States, it presents the results of an inductive in- 
quiry into the size, location, and integration of “industrial units” in seven Indian 
industries—cotton, jute, sugar, iron and steel, coal, paper, and cement. 

The first part deals with trends in the size of industrial units, and with relations 
between size, profit rates, and cost of production. The second part treats the degree 
of localization (i.e., regional concentration) of the industries, and inter-regional
and inter-unit differences in rates of profit and in production costs. The third part 
deals with trends in the managerial and financial integration of industrial units in 
the selected industries. While the study extends over the first half of the present 
century where data are available, financial analyses go back only to 1938. The book 
is in the tradition of P. Sargant Florence, to whose works on industrial organization 
Mehta pays a tribute. The author has also drawn inspiration from publications of 
the Federal Trade Commission, the Temporary National Economic Committee, and 
of Blair, Crum, Hoover, Thorp, and other American students. 

Because American statisticians and industrial economists will be as much interested 
in the author’s concepts and methods as in his substantive findings, we shall empha- 
size the former in this brief review, which necessarily cannot penetrate deeply into all 
aspects of this massive work. 

Mehta does not state the grounds on which he selected the seven industries for 
study. Presumably, their great importance in the industrial sector of the Indian 
economy and the ready availability of information about them were the determining
factors. They are among the older Indian industries with comparatively well-established
production technologies. Because the author believes his work provides an
important factual basis for public policy, it should be pointed out that the tendencies 
exhibited by the selected industries are not necessarily those of the whole business 
population of India, in which such newer and more dynamic industries as petroleum, 
machinery, chemicals and electronics are now playing an increasingly important role. 

The unit of enterprise to which the author’s data refer is neither the individual 
plant, the business firm, nor the “establishment” of the United States Census. It is 
the “industrial unit,” defined as a plant or group of plants or any productive group 
under one ownership and situated in the same industrial area. Plants under the same 
ownership but situated in different areas are treated as separate “industrial units” 
in each area. While there is merit in regarding ownership as the distinguishing crite- 
rion, the “industrial unit” has ambiguities when, as is the case in India, many enter- 
prises under different ownerships come under the management of one “managing 
agency” which often treats all of its clients in one industry as a single business un- 
dertaking. 

In measuring the size of an industrial unit, Mehta rejects invested capital, total 
assets, or number of employees and chooses “technical equipment” as the preferable 
criterion, e.g., the number of spindles or looms installed in the case of cotton firms, 
or rated annual productive capacity in tons in the case of firms in the other industries. 
While any one criterion of size has its drawbacks, the difficulties of dealing with aged 
or obsolete vs. modern equipment installed in a plant (problems to which the author 
does not allude) make it doubtful whether the amount of “technical equipment” really
is superior, especially when its use severely limits inter-industry comparisons of the 
size of firms. 

Mehta finds there has been a persistent tendency toward growth in the size of the 
modal industrial unit in the selected industries during the past half-century—a con- 
clusion which will not surprise American readers. Although the modal firm has become 
larger, firms of relatively small capacity continue in existence in each industry and 
the size-range continues to be surprisingly wide. After presenting frequency distri- 
butions of the size of Bombay cotton mills in 1905 and in 1951, the author asserts that 
“there is a tendency for industrial units to grow out of their humble beginnings” (p. 
26). While this is probably true, the data presented would not prove it unless they 
referred to distributions of identical units at two points of time, and it is not clear that 
they do. One is also troubled by the possibility that some of the apparent increase in 
size of the typical firm may be illusory, because productive capacity rather than actual 
output is the criterion of size and older firms tend to build up stocks of older equip- 
ment, some of which is kept idle on a stand-by basis. 

A more fundamental limitation, however, is that figures on the number of in- 
dividual plants operated by each industrial unit are lacking, as are frequency dis- 
tributions of the size of plants over the half-century under study. Such data would 
have shown the extent to which the growth in size of industrial units was due to (1) 
changes in the technology of physical production which were producing plants of 
larger size, and (2) changes in the technology of marketing, financing, and manage- 
ment which were increasing the number (as well as size) of plants which could be most 
efficiently operated under one ownership. This is obviously an important distinction 
for public policy. 

When he comes to study the relation between size and profitability of industrial 
units, Mehta is well aware that he treads on thin ice because of the deficiencies of 
profit data. American economists aware of the pitfalls in Statistics of Income of United 
States corporations may smile ruefully at Mehta’s complaint that he lacks for India 
such a “complete, authentic, and adequate” source of information regarding capital
employed and profits earned by business units of different sizes and types (p. 78). 
Realizing that reported profits mean much different things for corporations of dif- 
ferent sizes, because of the great importance of entrepreneurial salary withdrawals in 
small corporations, they will question the assertion that errors in the profits reported 
by particular businesses may be ignored because “units of all sizes are equally affected 
by them” (p. 81). 

After applying correlation analysis, the author concludes that the profit rate 
(defined as the percentage of net income to the book value of shareholders’ equity) 
is positively and significantly correlated with the size of industrial units in all indus- 
tries excepting cotton. This finding squares with his previous conclusion that in- 
dustrial units have tended to become larger, presumably in order to achieve the higher 
profits associated with the economies of scale. Nevertheless, many small units earned 
high returns, especially in the cotton industry; and the very largest units in each 
industry appeared to realize lower returns than those just under the pinnacle of size. 
The average reported profit rate during the decade 1938-47 varied from 6.9 per cent 
in the sugar industry to 15.0 per cent in the cotton industry; but the author properly 
denies that the figures reflect true differences in the economic returns upon invest- 
ment in the different industries, because of differences in their accounting practices, 
degree of state regulation and protection, intensity of competition, and other factors. 

To ascertain whether large units are more efficient than small ones, the author also 
examined relationships between size and unit costs of production—including raw 
materials, wages, stores, power, rent, taxes, depreciation, and selling (and adminis-
trative?) expenses. He was obliged to rely upon financial statements for 1948 supplied 
by only 220 concerns. His conclusion is plausible that the percentage of production 
costs to value of goods sold showed a clear tendency to decline as size increased; but 
this does not necessarily demonstrate superior efficiency on the part of the larger firms, 
when one considers that these are the financial results for only one year of a rather 
small sample of firms. 

Less ambiguous data make Mehta’s studies of the degree of localization of Indian 
industry more satisfactory than his analysis of profitability. Employing Florence’s 
concepts of the location quotient, the coefficient of localization, and the coefficient 
of linkage, he finds that the regional distribution of productive activity in the 
seven Indian industries is extremely uneven, both absolutely and in relation to the 
distribution of the industrial population as a whole. The jute and paper industries 
were concentrated in Bengal, sugar in the United Provinces and Bengal, cotton in 
Bombay, cement in the Central Provinces and Bihar, and iron and steel and coal in 
Bihar and Bengal. 
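
The first two of Florence's measures can be illustrated with a short sketch. The review does not reproduce the book's formulas, so the definitions below are the ones usually attributed to Florence and should be read as an assumption: the location quotient is a region's share of an industry divided by its share of all industrial workers, and the coefficient of localization is the sum of the positive differences between those two sets of shares. The regional figures are hypothetical and are not taken from Mehta's tables.

```python
def shares(counts):
    """Convert worker counts by region into proportions of the total."""
    total = sum(counts.values())
    return {region: n / total for region, n in counts.items()}


def location_quotients(industry_workers, all_workers):
    """Region's share of the industry divided by its share of all workers."""
    ind = shares(industry_workers)
    tot = shares(all_workers)
    return {r: ind.get(r, 0.0) / tot[r] for r in tot}


def coefficient_of_localization(industry_workers, all_workers):
    """Sum of positive deviations of industry shares from all-worker shares.

    0 means the industry is spread exactly like industry as a whole;
    values near 1 mean it is confined to a region with few other workers.
    """
    ind = shares(industry_workers)
    tot = shares(all_workers)
    return sum(max(ind.get(r, 0.0) - tot[r], 0.0) for r in tot)


# Hypothetical worker counts (thousands), three invented regions.
all_workers = {"Region A": 500, "Region B": 300, "Region C": 200}
industry_workers = {"Region A": 80, "Region B": 15, "Region C": 5}

lq = location_quotients(industry_workers, all_workers)
coef = coefficient_of_localization(industry_workers, all_workers)

for region, value in lq.items():
    print(f"{region}: location quotient {value:.2f}")
print(f"coefficient of localization: {coef:.2f}")
```

A quotient above one marks a region holding more than its proportionate share of the industry, which is the sense in which the reviewer calls the regional distribution uneven "in relation to the distribution of the industrial population as a whole."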

During recent years productive activity in the seven Indian industries has been 
dispersing more widely as a result of new mineral discoveries, new power sources, the 
cheapening of transportation, and the growing mobility of labor. Regional differences 
in the realized profit rates appear not to have played a major role in this movement. 
Mehta’s studies of regional variations in profit rates turn out inconclusively; inter- 
regional variations in profit rates really reflected industrial differences in profit rates 
combined with local industrial specialization. Yet the emergence of new economic 
forces in India since independence has created new profit prospects in the various 
regions which will spur on the movement toward decentralization of industry—a 
movement being actively sponsored by the central government, as this writer found 
when visiting India in 1955. 

Mehta’s analysis of managerial and administrative integration in the seven in- 
dustries will be highly interesting to all those concerned with public policy. “The 
most striking feature of India’s industrial development,” he writes, “has been the 
concentration of ownership and control in fewer hands and in fewer establishments” 
(p. 247). The managing agency has operated as a mechanism for both vertical and 
horizontal integration of the control of individual industrial units. The author de- 
plores the fact that in 1951 managing agencies administered more than 600 units, of 
which 250 were under the control of nine British agencies and 201 were under the 
control of eight large Indian agencies. More than one-third of the productive ca- 
pacity of the Indian cotton mill industry was controlled by 30 managing agents; 
about 40 per cent of the number of jute mills were in the hands of four agencies; and 
there were comparable concentrations in the sugar, coal, and cement industries. More 
than 90 per cent of iron and steel capacity was controlled by Martin-Burn and 
Company and by Tata Industries, Ltd. Concentration of control has been increasing 
during the present century, although Indian managers and directors have strongly 
tended to replace British during recent years. 

Mehta appears to draw from these figures, and from tabulations of multiple and 
interlocking directorships held by industrialists, the conclusions that concentration 
of ownership and control of Indian industry is exceptionally high, and that it has 
passed beyond the point where competition could be effective. These inferences are 
not, however, justified. In the first place, the study concerns the older and more 
settled industries in which processes of integration have had—in the absence of effec- 
tive antitrust actions by the government—more time to mature, and in which inte- 
gration is probably relatively pronounced. Even so, when cast up against the “concentration
ratios” for American manufacturing industries, the Indian figures suggest
merely oligopolistic types of industrial markets in which competition could be effec- 
tive, if there were active rivalry among the several managing agencies. 

Most of Mehta’s measures of concentration refer to number of companies rather 
than to the proportions of sales or assets under the control of managing agencies in 
each industry. The over-all concentration of corporate ownership of assets in Indian 
industry appears to compare favorably with that in the United States. According to 
a study seen by this writer, the 28,532 joint stock companies registered in India in 
1954 reported a total “gross block” (original expenditures on land, buildings and
equipment) of 12,484 millions of rupees. The 333 companies managed by the 22 
largest managing agencies had a “gross block” of 2,918 millions of rupees, or 23 per 
cent of the total. This may be compared in a rough way with the approximately 31 
per cent of the total assets of all American non-financial corporations held by the 361
companies with assets of $50 million or more in 1948. To make this comparison is 
not, of course, to suggest that the government of India should not be concerned with 
concentration of industrial ownership and control. The year 1956 marked the pas- 
sage of India’s first general antitrust legislation. It is to be hoped that this will be 
followed by other measures to energize private enterprise and competition. Education
programs designed to enlarge the supply of trained entrepreneurial and managerial
talent, in which India is lacking, would be especially helpful.
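
The 23 per cent figure quoted above can be checked directly from the numbers given. The sketch below uses only the figures stated in the review; the variable names are illustrative. The share of companies is computed as well, to show how small a fraction of registered companies accounts for that share of the gross block.

```python
# Figures quoted in the review (gross block in millions of rupees, 1954).
total_companies = 28_532       # joint stock companies registered in India
total_gross_block = 12_484     # gross block of all registered companies

managed_companies = 333        # companies under the 22 largest managing agencies
managed_gross_block = 2_918    # gross block of those companies

asset_share = managed_gross_block / total_gross_block
company_share = managed_companies / total_companies

print(f"share of gross block: {asset_share:.1%}")
print(f"share of companies:   {company_share:.1%}")
```

The American figure of 31 per cent rests on a different asset concept and a different year (1948), which is why the reviewer offers the comparison only "in a rough way."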

Mehta’s study was undertaken to answer important questions regarding the evo- 
lution of Indian industry. Within the limitations of the available data he has cast 
much new light upon the structural tendencies in evidence up to the early Fifties. 
The second Five Year Plan of economic development of India, which became effec- 
tive April 1, 1956, emphasizes huge public and private investments in a drive for 
industrialization. This work will provide valuable guidance to both public officials 
and private entrepreneurs by suggesting types and channels of investment which 
promise the largest social returns. 


British Industry, 1700-1950. Walther G. Hoffmann. (tr. W. H. Chaloner and W. O. 
Henderson) Oxford: Basil Blackwell, 1955. Pp. xxiii, 338. 35s. 


William N. Parker, Resources for the Future, Inc.


Hoffmann’s well-known measurement of production trends in British industry,
Wachstum und Wachstumformen der Englischen Industriewirtschaft von 1700 bis 
zur Gegenwart (Jena, 1940) has now become available in English in a revised version. 
The revision consists of a thorough overhaul of the footnotes, some additional text by 
Hoffmann, and a brief treatment of some indexes for the period 1935-1950. Hoff- 
mann’s own indexes are not themselves extended beyond the 1935 date covered in 
the German edition. 

The book consists of four major sections, an Appendix on the sources of the data, 
and a full set of charts and tables in which the series are set forth. The four sections 
of the text are devoted to (A) an index of physical output for the entire period 1700- 
1935, (B) indexes for the various industries, (C) evidence of a twenty-year cycle in 
the data, (D) evidence of a “life-cycle” in the individual industries over the whole 
period. The text concludes with a brief examination of secular change in other sectors 
of the British economy in the period and with a useful summary of the methods and 
results of the entire study. In each of the four sections, the statistical method em- 
ployed in handling the data is described before the conclusions are presented. 

Production indexes for 54 individual “industries” are calculated for at least some 
portion of the period. The coverage of the 7 indexes for the 18th Century is estimated 
at 47 per cent of total industrial output in 1740; this figure is around 70 per cent for 
the periods after 1812. For the combined index seven weighting periods are chosen
and separate indexes with base year weights are linked. Weights are computed by 
estimates of value added, based on British or American sources, and on wage data. 
A significant feature of the original data is the extensive use of what a grammarian 
would recognize as metaphor and synecdoche: the use of an available series to repre- 
sent another not available, and the use of substitute series, allowing a part (e.g., 
coal shipments to London) to stand for the whole (coal production before 1854). 
Since actual output data for manufactured items are very scarce, except for items 
subject to the excise taxes, the output of ores, crude metals, timber, and wool ad- 
justed by trade statistics is used to represent output of various lines of manufacture. 
In Sections C and D, the individual series are smoothed twice by ten-year geometric 
moving averages. In Section C cycles are measured and compared in the smoothed 
series; in Section D the trends in the annual percentage change in the smoothed series 
are analyzed. Whereas in Sections A and B the “mean coefficient of expansion” over
a long period is taken from the slope of a straight line fitted by least squares to the 
logarithms of the data, in Section D the whole array of percentage increases from 
year to year is examined for a trend pattern among the various industries.
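
The two computations described in this paragraph can be sketched briefly. The review does not say how Hoffmann aligned his moving averages or how he expressed the fitted slope, so the choices below (an uncentered window, and a slope converted to an annual percentage rate) are assumptions, and the index series is invented for illustration.

```python
import math


def geometric_moving_average(series, window=10):
    """Geometric mean of each run of `window` consecutive values."""
    logs = [math.log(x) for x in series]
    return [
        math.exp(sum(logs[i:i + window]) / window)
        for i in range(len(logs) - window + 1)
    ]


def log_linear_growth_rate(series):
    """Annual growth rate implied by a least-squares line through ln(series)."""
    n = len(series)
    logs = [math.log(x) for x in series]
    t_mean = (n - 1) / 2
    y_mean = sum(logs) / n
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in enumerate(logs))
        / sum((t - t_mean) ** 2 for t in range(n))
    )
    return math.exp(slope) - 1


# Hypothetical output index growing steadily at 3 per cent a year.
index = [100 * 1.03 ** t for t in range(40)]

# "Smoothed twice": apply the ten-year geometric average to its own result.
smoothed = geometric_moving_average(geometric_moving_average(index))

rate = log_linear_growth_rate(index)
smoothed_rate = log_linear_growth_rate(smoothed)
print(f"fitted annual growth rate: {rate:.2%}")
print(f"length after double smoothing: {len(smoothed)} of {len(index)} years")
```

Each pass of a ten-year average costs nine observations, so a doubly smoothed series is eighteen years shorter than the original. A constant growth rate passes through the geometric average unchanged, which makes it a natural choice when the growth rate is the object of study.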

This study represents perhaps the most comprehensive attempt to measure and 
analyze the movement of industrial output in a national economy over so long a 
period. Considerable significance has been attached to its conclusions by British eco- 
nomic historians, despite some criticism of the data and methods. The period 1780- 
1877 reveals in the index annual rates of growth of two to four per cent, with rates 
under two per cent before and after. Within this period, the years before 1855 show 
generally higher rates, except for the interval of the Napoleonic Wars, than the period
1856-1876. Analysis of the rates for individual industries shows significant differences
between the rates for consumer goods and producer goods, between durables
and non-durables, between export-oriented and home market-oriented industries, 
etc. The cyclical analysis shows a fluctuation with a period of approximately twenty years
in most of the series and considerable similarity in the timing of the phases of the 
individual series. The analysis of the growth by industry shows a pattern of growth 
and decline in nearly all cases, although the annual growth rates themselves appear
to show in some cases a continuously falling trend. 

The Hoffmann index of industrial production has received rather wide use since 
the publication of the German edition of this book and of an earlier article (Welt- 
wirtschaftliches Archiv, 1934: II). Widest currency was given to it by its incorporation 
into the League of Nations indexes, published in Industrialization and Foreign Trade 
(Geneva, 1945). In one version, it was used too by Rostow in British Economy of the 
Nineteenth Century (Oxford, 1948) and appears with certain criticisms in the Rostow, 
Gayer, Schwartz volumes, The Growth and Fluctuations of the British Economy, 1790- 
1850 (Oxford, 1955). Most recently it has been pressed into service by Phelps-Brown 
and Handfield-Jones in a notable article, “The Climacteric of the 1890’s: A Study of
an Expanding Economy” (Oxford Economic Papers, October, 1952). Criticism of its
use in that article has been made by D. J. Coppock in a critical note in The Man- 
chester School (January, 1956). An earlier criticism was made by W. A. Lewis in 
The Manchester School (January, 1952). Taking his stand on the data of the produc- 
tion censuses of 1907 and 1924, Professor Lewis contended that the Hoffmann index 
understated the post-war growth by as much as 20 per cent. Coppock, making refer- 
ence to the earlier study of Tolles and Douglas (‘A Measurement of British Indus- 
trial Production,’ Journal of Political Economy, 38 (1930)) agreed with this conclu- 
sion. If the Hoffmann index has a strong and continuing downward bias, the case for 
a declining rate of growth may prove as ephemeral in the British economy as in the 
American. The case for a downward bias in the index must rest mainly on the funda-
mental objection made by Coppock to the extensive use of input data to represent the 
output series. Increases in efficiency in the use of materials and in the proportion of
value added for all industries together are concealed by the reliance on input data. 

This last criticism should be subjected to closer scrutiny than it has yet received. 
Some of the growth of manufacturing output relative to raw materials production 
comes from a transfer of manufactures (e.g. food processing) from households to in- 
dustrial establishments. In iron and steel, it is not impossible that the basic materials 
are not much more highly fabricated on the average than they were before structural 
and automobile steel became important uses. Except for the use of scrap, efficiency in 
the use of inputs has grown less for such materials as metal, wood, cotton, wool, than 
for fuel, machinery, and labor. This fact taken alone would tend to give the index an 
upward bias. It is true, however, that the share of value added has risen markedly in 
available Census data, and the criticism is probably justified.

To this criticism, one is tempted to add criticism on a few minor points. Some may 
feel that the index, covering 235 years, is presented with insufficient caveats about the 
dangers of comparison over long periods. The validity for short periods depends on the 
choice of period and weighting years, and the shifts in the weights, as Gerschenkron 
has most recently emphasized, are as important a measure of the industrial trans- 
formation as the index itself. One might wish thus that it had been possible to pub- 
lish the sources in more detail, particularly those for the table of weights (pp. 18- 
19). (Perhaps Kuznets has spoiled us all a bit on this score.) On p. 9 the exposition is 
not at all clear, and on pp. 181-2 it is not clear whether the diagram is intended to 
represent a continuously falling rate of output as it appears to do. Hoffmann quite 
properly does not attempt any comprehensive explanation of the phenomena he has 
described. His suggestions—of a twenty-year replacement cycle and of certain his- 
torical causes of the falling rate of growth after 1880—go farther toward eliciting 
curiosity than toward satisfying it. 

Such criticism cannot detract from the significance of Hoffmann’s achievement. 
Comparison with Burns’ Production Trends in the United States since 1870 (New York, 
1934) inevitably suggests itself, not only because of the approach, but also because 
of resemblance in the conclusions. Together the two studies have established the 
possibility of measuring the growth of industrial output for significant historical pe- 
riods. They have posed a problem for business cycle analysis. And most important 
for our present concerns, they have given empirical content to the model of a growing 
economy composed of “industries” in various stages of development. To give this 
model theoretical articulation, and to set it within a broad framework of explanation 
is the task of modern economic history. Usher has written: “Statistical analysis of 
economic and social phenomena carries one rapidly toward substantial analysis of 
historical process.” The two must not be confused, but in performing the first, Hoff-
mann has set limits and direction for the second. The translation of his book is an im- 
portant stimulus to this work; that the translation should have been done with such 
skill and the editing with such care was to have been expected by anyone familiar 
with the same translators’ earlier edition of Schlote, but should not go unmentioned 
by a grateful reader. 


Determining the Business Outlook. Herbert V. Prochnow, Editor. New York: Harper & 
Brothers, 1954. Pp. xi, 445. $6.50. 


Robert E. Snyder, The Prudential Insurance Company


This is a collection of nineteen articles designed for businessmen, college students,
and others interested in business indicators and methods of short term business 
forecasting. The articles, specially written for the book, were contributed by such
well known economists as Jules I. Bogen, Ewan Clague, Walter Hoadley, and L. R. 
Klein. Each article gives major sources of statistical data and each includes a compre- 
hensive bibliography on the topic under discussion. 

The opening article deals with the role, general methods, and difficulties of business 
forecasting. Next comes an article on the meaning, measurement, and causes of the 
business cycle. This is followed by discussions of such business barometers as gross 
national product, savings, department store sales, the Federal Reserve Index of 
Industrial Production, wholesale and retail prices, and employment. There are also 
chapters on specific industries including construction, automobile manufacture, and 
public utilities. Another group of articles covers a variety of fields such as the stock 
market, foreign trade, Treasury operations, and long term forecasting. 

Limitations of space are bound to cause gaps in the treatment of a subject as 
broad as this one. It does seem, however, that a book of this kind might include a discussion
of short term consumer credit in view of its role in postwar consumption,
and the effects on general business conditions if credit repayments exceed new ex- 
tensions. On the general subject of statistical data—the raw material of this book— 
more frequent mention might have been made about their limitations in respect to 
reliability. Nor is mention made about the lag in publishing current statistics, or the 
frequent revisions in certain series which affect their direction or the timing of turning 
points. 

Aside from these points, however, the book achieves its purposes of describing the 
major measures of business activity and illustrating forecasting techniques. In addi- 
tion, the information and interpretation which are provided make it a useful eco- 
nomic primer. The flow of economic activity is separated into its secular, cyclical, 
seasonal, and random components. Clear and concise explanations are given of the 
meaning and significance of GNP and savings, both as economic concepts and as 
business indicators. Those interested in the economic process will get some idea of 
the interdependence between general business activity and such important areas as 
construction, agriculture, and automobile manufacture. There is a good survey of the 
role and methods of the Federal Reserve in counteracting cyclical swings. And the 
far-reaching influence of the Federal Government is further pointed up through a 
discussion of Treasury operations and their effect on the economy. 


Trends and Cycles in Economic Activity. William Fellner. New York: Henry Holt & Co., 
1956. Pp. xiv, 411. $5.00. 


Bert F. Hoselitz, University of Chicago


This book presents, on the level of an advanced text, the model of secular economic
growth which was enunciated on the basis of Keynesian theory first by Roy
Harrod and E. D. Domar and later developed by a number of other writers. Fellner
does not add any basically new insights to this body of theory. His book excels, 
nevertheless, by the unquestioned erudition of the author, by the clear and precise 
exposition, and the careful analysis of interrelations between secular growth trends 
and cyclical disturbances. 

The major propositions of the work may be summarized as follows: Assuming a 
balanced budget and absence of direct intervention by the government in the pricing 
or allocation mechanism, the uninterrupted growth of an economy depends upon the
capacity of the system to guide savings into anticipated investments of equal magni- 
tude while maintaining, at the same time, a fair degree of stability of the general 
price level. In other words, Fellner stipulates as a “joint condition of dynamic equi- 





BOOK REVIEWS 697 


librium,” an “approximate stability of the general price level” coupled with “equality 
of desired net capital formation with net savings at positive levels” (p. 190). This 
joint condition of dynamic equilibrium is satisfied if three corollaries are met: (1) a
sufficient number of improvements (i.e., technological innovations) offsetting the 
tendency of decreasing returns; (2) gradualness of change in productive or industrial 
structure associated with sufficient factor mobility to adapt without undue friction 
to changes in structure; and (3) an efficiently functioning monetary and credit sys- 
tem to avoid bottlenecks or oversupplies from the monetary side. 

Since Fellner assumes that the rate of capital accumulation exceeds that of growth
of the labor force or of the supply of natural resources, technical improvements, es- 
sentially of a labor and resource saving kind must intervene to overcompensate the 
tendency towards diminishing returns. In this connection Fellner also discusses capi- 
tal coefficients and their relative magnitude at different stages of growth. The 
postulate of sufficient resource mobility is required to avoid bottlenecks and break- 
downs arising from overspecialization of productive factors, and the third corollary 
points to the need of avoiding an imbalance between money supply and real output 
by stipulating that neither should become scarce or overabundant relative to the 
other. Whereas Fellner presents a very adequate discussion of the institutional re-
quirements of the third corollary, adding a magnificent sketch of the historical de-
velopment of “financial crises” during the last 150 years, his discussion of bottlenecks
presented by insufficiently mobile resources is disappointing. Above all, he does not 
adequately discuss the problem of differences in skills required by various forms of 
specialized laborers and the problems of complementarity and/or substitutability 
presented by highly specialized human resources. 

All three corollaries are concerned with the avoidance of relative scarcities and 
none is explicitly oriented to the sufficiency of effective demand. But Fellner argues 
that given his assumptions of a neutral role of government and adequate discounting 
of business risks, the successful overcoming of relative scarcities produces a situation
in which net investment and net savings will be equal at stable prices and that thus 
aggregate effective demand will be “just right” to maintain the system in dynamic 
equilibrium (p. 236). 

This stipulation of the three corollaries also contains the causes of cyclical fluc-
tuations of the system. An economy will deviate from the path of dynamic equi-
librium if one of the corollaries is not met. Thus cyclical fluctuations may be pro-
duced because the process of technological improvement fails to function
properly, because of imperfect mobility or overspecialization of productive fac-
tors, or because of imbalance between the flow of money and of real output. Fellner
shows that in the historical experience of western countries imbalances of all three 
kinds have appeared at various intervals and have produced the cyclical movements 
with which we are familiar. He thinks, however, that these elements of instability 
should not be exaggerated at present, since we dispose of a considerable number of 
built-in stabilizers which tend to maintain the economy on or close to its long-run 
growth trend. 

Though Fellner does not discuss any problems which in themselves present sta- 
tistical reasoning, his work suggests a number of problems of measurement in which 
the use of statistical techniques is required. Primary among them is the question of 
empirically measuring the level of dynamic equilibrium at various stages of the 
secular growth process, and subordinated to it the empirical determination of such 
magnitudes as capital coefficients, rate of improvements, or even rate of capital for- 
mation. Simple reflection will show that these concepts represent a substantial chal-
lenge to our ingenuity, as concerns their operational definition and, even more, as
concerns their measurement. Yet without some estimate of their magnitude at dif- 
ferent stages of the growth process or their difference as between different industries 
or classes of industries, our understanding of long-run economic growth is seriously 
deficient. 


Characteristics of the Low-Income Population and Related Federal Programs, selected 
materials assembled by staff of the Subcommittee on Low-Income Families, 84th Congress, 
1st Session. Joint Committee on the Economic Report. Washington, D. C.: United States
Government Printing Office, 1955. Pp. xii, 240. Free. Paper. 


A Program for the Low-Income Population at Substandard Levels of Living. 1955 Report 
on Economic Statistics. Reports Nos. 1311 and 1309 of the Joint Committee on the Economic 
Report to the Congress of the United States, 84th Congress, 2nd Session. Washington, 
D. C.: United States Government Printing Office, 1956. Pp. iii, 14; iii, 22. Free. Paper. 


D. Gale Johnson, University of Chicago


The first two publications are quite similar to two other publications, each dated
six years earlier, of the same committee.¹ The collection of statistics and descrip-
tive materials in 1955 has included 100 more pages than in 1949, but it is not clear
that the passage of six years and the addition of 100 more pages of material has 
added substantially to our knowledge of characteristics of low-income families and 
individuals. 

With the exception of one study reported in the 1955 collection, families and 
persons are classified as low income on the basis of the income of a single year. When 
this is done, many persons are in the low-income group because of incomes that are 
temporarily low. Many young people who are working only part time or who have 
held jobs less than a year, but whose long-run income prospects are quite satisfactory, 
thus are included in the low-income group. Likewise, family heads who have suffered 
some unemployment may be so included. Numerous other temporary factors leading 
to relatively low incomes in a given year could be mentioned. The point is significant 
because the use of income data for any one year as a criterion for defining low income 
means that we are unable to determine the characteristics of the families and indi- 
viduals whose incomes are generally low. The one effort to avoid some of the pit- 
falls of placing exclusive reliance upon income data for a single year has been made in 
a study, not yet published, by the Franklin D. Roosevelt Foundation. This study, 
based on data from the 1950 Survey of Consumer Expenditures, used information on 
the relation between total expenditures and disposable incomes, the allocation of 
total expenditures among various major components, and information on fluctuation 
in income to determine the group that should be classified in the category of low 
economic status. In this way many families with temporarily low incomes were ex- 
cluded while other families with temporarily high incomes were included, properly, 
in the low economic status category. 

While the recommendations in the two policy reports are arranged somewhat dif- 
ferently, the similarity in the recommendations is quite marked, which is not too 
surprising since two of the three authors of the 1956 report (Senators Sparkman and 
Flanders) were two of the four authors of the 1950 majority report. The recommenda- 
tions, in my opinion, are worthy of consideration and if implemented would make a 
significant contribution to the solution of some of the problems faced by low-income 
groups. 





¹ Low-Income Families and Economic Stability, Materials on the Problem of Low-Income Families, assembled
by the staff of the Subcommittee on Low-Income Families, Joint Committee Print, 81st Congress, 1st Session,
1949, and Low-Income Families and Economic Stability, Report of the Subcommittee on Low-Income Families of the 
Joint Committee on the Economic Report, Senate, 81st Congress, 2nd Session, Document No. 146, 1950. 










Philadelphia Workers in a Changing Economy. Gladys L. Palmer. Philadelphia: Uni- 
versity of Pennsylvania Press, 1956. Pp. xiv, 189. $6.00. 


Howard W. Johnson, Massachusetts Institute of Technology


Students of the labor market are familiar with the series of Philadelphia labor
force and employment studies published by a group of Wharton School researchers
beginning in 1921 and particularly since 1929. The total of these studies undoubtedly 
adds up to a more detailed analysis of the Philadelphia labor market than exists for 
any other area of the country. The book serves an extremely useful purpose in sum-
marizing, in a sense, this series and in examining the data against the long-run evi-
dence of substantial change in the character of the Philadelphia labor market. In
addition, Palmer relates the data, for the first time, to broad questions of economic 
development and planning involved in the study of the labor market. 

The volume opens with a brief examination of the historical data describing Phila- 
delphia as a manufacturing city. The third chapter compares the positional growth 
of the city with the other principal manufacturing centers of the country and very 
briefly reviews the demand changes for labor in several specific industries from World 
War I to the present. The fourth chapter summarizes demographic data related to 
the labor force, and relates changes in the data over time to mobility of labor in the 
area. The next two chapters are an integration of the several excellent studies in the 
series related to the effects of the depression and the war prosperity on the labor mar- 
ket. The final chapter broadly relates the Philadelphia experience to the growth pat- 
terns of metropolitan communities, the changing character of the urban centers, 
problems of city planning, and general patterns of labor market behavior. The vol- 
ume includes a series of appendix tables on which the material is largely based. It also 
includes a brief technical appendix. 

That the book will be helpful to the economist interested in regional and area labor
market data is certain. It represents perhaps the most intensive survey of the chang- 
ing characteristics of a labor force in a metropolitan area over time. It is unlikely to 
be matched in the near future by a study with the same range and scope. The sub- 
stantial achievement of the author is that she has captured the dynamics of a metro- 
politan market with its simultaneous functions of stability and change. Unlike most 
of the earlier studies that comprise the series, this is not a study of a particular in- 
dustry or of a particular segment of the labor force. All the more remarkable, then, is 
its success in keeping the analysis at the level of over-all market interaction in terms 
of changes in the product market, demographic factors, relationships with other 
market areas, and general level of business activity. 

As can be expected in the case of a volume of this broad compass, a few inadequacies 
can be noted. It is clear that the book is intended as a general summary for a some- 
what general audience including economists, planners of cities and mobilization proc- 
esses, union and industry representatives concerned with recruitment and employ- 
ment, and product market analysts. As such the statistician will occasionally be 
disappointed by the necessary brevity of the statistical treatment of some topics, 
for example, labor mobility. Occasionally, one is surprised by minor flaws in a piece 
of typically careful work. (For example, the support of a statement that job tenure in 
particular companies tends to be longer in Philadelphia than in other cities is built 
on companies in western cities rather than in eastern. In another instance, Phila- 
delphia is described as being characterized by small and moderate sized companies 
to a greater extent than other cities, but this is substantiated only on a highly selec- 
tive basis.) Finally, there is some typical problem of dealing with a changing concept 
of what constitutes the Philadelphia market during the 50 years most under ques-
tion. In 1900, the boundaries of the market were the city itself. By 1940, the market
obviously comprised most of the 8 counties of the Standard Metropolitan Area. The 
reader is occasionally not quite sure what area the data for a given year describe
and whether the comparisons involved are, indeed, the appropriate ones. By and 
large, however, the difficulties of dealing with a changing geographical base for the 
market are very effectively handled. 

The final chapter of conclusions raises, as any good research summary chapter 
should, even more questions than it answers. Most important are those related to 
problems of organization of the city base as a more effective labor market. “It is a 
truism,” as Henry Schultz long ago observed, “to say that economic phenomena are
not static but dynamic,” and this, in terms of the labor market, is as well described in
this small volume as it is anywhere in the literature. 


Resource Productivity, Returns to Scale, and Farm Size. Earl O. Heady, Glenn L. John-
son, and Lowell S. Hardin, Editors. Ames, Iowa: Iowa State College Press, 1956. Pp. xi,
208. $3.50. 


J. B. Hassler, University of California (Berkeley)


The proceedings of a meeting of the North Central Farm Management Committee
on Farm Scale and Resource Productivity, held in Chicago, October 19-20, 1954, 
are contained in this book. Twenty-two basic papers and seven discussion papers are 
classified into five major sections. These sections can be categorized approximately as 
follows: (1) Framework for the Problem (economic theory, models, and some meas-
urement techniques related to production functions); (2) Some Historical Accomplish-
ments (budgeting, business records, and farm surveys); (3) Some Problems and
Procedures (technical considerations for budgeting, linear programing, simultaneous 
equations, aggregation, classification, and sampling with respect to scale analysis); 
(4) Illustrative Examples (including model modifications); (5) Individual and Group 
Values in Farm Management Analysis (personal values and institutional structures 
as elements of the problem). 

Although all of the papers are worthy of careful study, this reviewer would par- 
ticularly recommend 1, 7, 8 (Heady), 2, 3, 9 (G. L. Johnson), 20 (O. R. Johnson), and 
Jensen’s discussion to 21 to research workers in this field. The papers by Heady deal 
with the economic theory of production functions, certain logarithmic and poly- 
nomial models, linear programing and budgeting, and certain technical relations 
both in theory and for empirical results. G. L. Johnson’s papers cover the unique na- 
ture of the management process (and its role as an input), the conditions on acquisi- 
tion and salvage values that determine the fixed or variable economic status of an 
asset, and problems of classification and aggregation for inputs and outputs. O. R. 
Johnson’s paper considers, very sharply, relevant aspects of the small farm both from 
the viewpoint of resource productivity in a direct sense and from a broad social level. 
Jensen’s paper has primary value for its support of analytic methods to provide in- 
formation to assist in decision making and for cautioning such analysts against as- 
suming that their results possess unquestionable accuracy or moral dictums for 
society. 

Some of the mathematical terminology used occasionally by Heady might make 
the purist wince, but the context usually clears the meaning. One specific technical 
error on page 75 should be pointed out even though it leads to conservative positions 
upon application. For a population regression function $I = \alpha + \beta C + \delta C^2$, having
$M = dI/dC = \beta + 2\delta C$ and sample estimator $M' = b + 2dC$, the interval estimate of $M$
for $C = C^*$ is not given by
$(b \pm t_{\alpha/2} S_b) + 2C^*(d \pm t_{\alpha/2} S_d) = (b + 2C^* d) \pm t_{\alpha/2}(S_b + 2C^* S_d)$
but by $(b + 2C^* d) \pm t_{\alpha/2} S_{M'}$, where $S_{M'}^2$ is the unbiased estimate of
$\sigma_{M'}^2 = \sigma_b^2 + 4C^{*2}\sigma_d^2 + 4C^* \sigma_b \sigma_d \rho_{bd}$.
Since $C$ is non-negative, $\rho_{bd}$ will be negative and certainly not equal to
unity as would be required for Heady’s statement which, consequently, results in
intervals of excessive length.
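
The size of the discrepancy can be illustrated numerically. The following is only a sketch: the standard errors, the correlation, the value of C*, and the critical t value are hypothetical numbers chosen for illustration and are not taken from Heady's paper, and the function name is likewise an assumption.

```python
import math


def marginal_half_widths(s_b, s_d, rho_bd, c_star, t_crit):
    """Half-widths of two interval estimates for M = beta + 2*delta*C at C = c_star.

    Returns (correct, summed):
      correct uses Var(b + 2*C*d) = s_b^2 + 4*C^2*s_d^2 + 4*C*s_b*s_d*rho_bd
      summed  adds the two separate half-widths, t*(s_b + 2*C*s_d)
    """
    var_m = s_b ** 2 + 4 * c_star ** 2 * s_d ** 2 + 4 * c_star * s_b * s_d * rho_bd
    correct = t_crit * math.sqrt(var_m)
    summed = t_crit * (s_b + 2 * c_star * s_d)
    return correct, summed


# Hypothetical inputs (assumptions for illustration only).
correct, summed = marginal_half_widths(
    s_b=0.4, s_d=0.05, rho_bd=-0.8, c_star=3.0, t_crit=2.0
)
print("half-width from variance of b + 2C*d:", round(correct, 4))
print("half-width from summed intervals:   ", round(summed, 4))

# The two agree only when the estimated coefficients are perfectly correlated.
same_a, same_b = marginal_half_widths(0.4, 0.05, 1.0, 3.0, 2.0)
```

With a negative correlation between b and d the summed half-width is the larger of the two, which is the sense in which the procedure criticized above is conservative.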

Now, to a general appraisal. The book (plus references) gives the reader an under-
standing of the current status of research in farm production-function analysis al-
though the coverage of programing and illustrative examples is far from adequate. It
must be conceded that the problem is exceedingly complex and that solutions so far 
are primitive and lack essential detail for application; therefore this book should set
up a challenge for research workers to improve this area of economic analysis. More 
adequate results will only come after certain current deficiencies are handled more 
realistically. These deficiencies include such problems as the following: (1) More 
detail and less aggregation must be secured in general. (2) The time dimension must 
be handled more adequately. A farm production function is not a single entity with 
input and output rates consistent with a single time period for rate definition. Rather, 
it is a dynamic sequence of intertemporally related subproduction functions having, 
in most cases, a different time-period basis for each one and generally simultaneous
dependence. Programing tends to recognize this problem more adequately than most 
production function analyses have. (3) Carelessness in using input sources in lieu of 
the effective inputs of such sources must be avoided especially when the source is a 
“large” discrete unit capable of supplying the effective input(s) at a variable rate up 
to capacity. (4) Much intensive research must take place at the subproduction func- 
tion level. Accurate underlying data on input-output coefficients are critically 
needed. (5) The effects of climatic conditions on the various subproduction functions 
should be studied carefully. Such information could improve decision making as a 
particular season progressed and could give some indication of specific risks in the
long run. Effects of varying the timing of certain activities should also be considered.
(6) Problems of capital accumulation and rationing should be considered in any dis- 
cussions of scale changes. (7) Integration of the physical production function analyses 
(or programing), with adequate macroanalyses for determining relevant prices for 
inputs and outputs, must be accomplished if the research is to be of much value as 
an ex-ante planning device. 


Mathematical Models of Human Behavior. Proceedings of a Symposium. Stamford, 
Connecticut: Dunlap and Associates, 1955. Pp. vii, 103. Paper. Free when available.* 


James S. Coleman, University of Chicago


This slim paperbound volume reports the proceedings of a symposium on mathe-
matical models in social science, sponsored by Dunlap and Associates in 1954.
The papers can be roughly grouped as follows: six papers in the area of decision theory 
and game theory; two papers on stochastic learning models; one paper on scaling 
models in psychology; and one paper concerned with distributions of accidents. Thus
the book concentrates largely on decision theory, with a slight admixture of work in 
other areas. 
Three of the decision-theory papers are strictly concerned with individual decision-
making in gambling situations, stemming from the von Neumann-Morgenstern
axioms for utility under risk. Harry Markowitz reexamines the phenomenon of
simultaneous gambling and insurance-taking by the same individual, and comes up
with a more plausible utility curve than did Friedman and Savage, who first dis-
cussed this. Murray E. Jarvik discusses in a fairly loose vein the importance of dis-
tinguishing between an individual’s estimate of the probability of an event and his
willingness to gamble on the event. Ward Edwards’ paper takes some actual steps in
this direction, reporting the results of extensive experiments in gambling which are
fitted to a model which includes subjective probability as well as a utility function.

* In reply to an inquiry, the technical librarian of Dunlap and Associates wrote on 21 June 1956, “This report
is not for sale. It was sponsored by the Department of Defense and has been distributed at no cost to research
workers and libraries. Unfortunately, we have nearly exhausted our present supply of copies. At the moment we
are keeping all requests on file pending a decision to reprint and are advising requestors of this action with a further
recommendation to consult one of the libraries to which deposit copies were sent.”

Moving a little away from decision-making under risk, Jacob Marschak attempts
to combine some of the principles developed in the theory of decisions under risk with 
psychologists’ work on paired comparisons to obtain a theory of riskless choice. A 
paper by Merrill Flood reports results of group preference experiments in which an 
attempt was made (unsuccessfully) to connect up with a game theoretic model. In a 
contribution to the theory of games itself, Duncan Luce’s paper presents a new con- 
cept of coalition formation for the solution of an n-person game. 

Of the two short papers concerned with learning theory, Robert Bush’s is a note on 
the use of stochastic learning models with more than two response categories. William 
Estes, using his own learning models as an example, discusses the necessity of inter- 
pretation of parameters, suggesting that little has been gained merely by obtaining a 
fit and identifying the parameters of a model. The paper on psychological scaling 
models, by Clyde Coombs and R. C. Kao, is a summary discussion of the kinds of 
“composition” models which can be inferred to underlie psychological response data.
The remaining paper, by Herbert Jacobs, develops a rather complex stochastic model 
for accident distributions, building on previously-developed statistical models of 
contagion. 

The volume is closed by remarks by Paul Lazarsfeld, on the use of mathematical 
models in social science. Lazarsfeld poses questions which the substantive social 
scientist raises and the model-builder must answer if his work is to be accepted as a 
contribution to social science. 

Taken as a whole, this volume is not an important one. The authors are major 
contributors to their respective fields, but most of these papers are merely short 
notes or fragments of work they have presented better elsewhere. Thus the work 
gives no more than a glimpse into what some mathematical model-builders are doing 
in social science, and at that a glimpse primarily into work on decision theory. This
was probably the modest goal of the symposium, and nothing more should be ex- 
pected from it. 


The Industrial Mobility of Labor as a Probability Process. Isadore Blumen, Marvin Kogan, 
and Philip J. McCarthy. Ithaca, New York: Cornell University, 1955. Pp. 163. $3.00, 
paper; $4.00, cloth. 


Robert R. Bush, New York School of Social Work, Columbia University,
and Bernard P. Cohen, Harvard University


The aim of this study is to “explore the possibility of finding a probability model
that will describe the flow of workers through the industrial structure of the
United States.” The major interest is in the prediction of long-term industry-to- 
industry movements based on the analysis of short-term movement. While the au- 
thors do not offer a model which completely describes this phenomenon of industrial 
mobility, they have made a substantial beginning. In view of the complex factors 
involved in industrial mobility, it is remarkable that a few restrictive assumptions 
can even approximately describe the process. 










Data for this study were from a ten-percent sample of the Continuous Work His- 
tory Sample of the Bureau of Old Age and Survivors Insurance, which is itself a one- 
percent sample of workers in employment covered by Social Security. Employee- 
industry summary data were obtained for the years 1947, 1948, and 1949; thus the 
study included more than 49,000 individuals who were in covered employment in 
one of these three years. Because Social Security reports are filed quarterly, the data 
cover twelve quarters of employment experience. The models utilize information on 
the average movement in adjacent quarters to predict movement in two quarters 
separated by longer intervals of time. 

The authors classify industries into eleven code groups based on the two-digit 
industry classification (for example, agriculture, construction, and metal), and are concerned
with movement between code groups. In addition, they present data on the differences
in movements as a function of age and sex and on the basis of this analysis decide to 
treat three age groups for each sex separately. That is, because movement decreased 
with increasing age and because men moved more than women, it was desirable to 
examine the models separately for males and females and for ages 20-24, 40-44, and 
60-64 within each sex. While the same formal model might apply to each of these 
groups, the parameters would differ from group to group. 

The authors are careful to note several problems in the use of the data. Two diffi-
culties stand out: the first concerns the classification of industries, whereas the
second involves the types of data which are not collected by the Bureau of Old Age 
and Survivors Insurance. The code groups chosen are to a large extent arbitrary. On 
the surface there appears to be more similarity among industries within a code 
group than between code groups, but it is possible that the code groups are too 
heterogeneous for this type of analysis. As the authors point out, “.. . if one uses a 
classification scheme which groups disparate industries, then the observed process 
will not appear to be of a Markov type even if it actually is of this type in some other 
appropriate classification scheme” (p. 152). Furthermore, this classification “loses” 
about twenty-five per cent of industrial movement, because that much movement 
takes place between industries which are in the same code group. 

The second problem is also related to heterogeneity. The authors treat age and 
sex separately, but there are other variables which affect the amount and kind of 
movement. They suggest that it would be desirable to stratify the sample by occu- 
pational groups and, of course, there are a number of other possible classifications. 
Unfortunately, most of the information one might use is not available in existing data. 

In Chap. IV, the authors describe and apply their “simple” model; industry code 
groups are identified with the states of a Markov chain having constant transition 
probabilities. (An additional state is identified with “not in covered employment.”) 
Elements of the transition matrix are estimated by averaging the data from all the 
adjacent pairs of time periods available. Iterates of the numerical matrix so obtained 
are then compared with the observed higher order transition matrices. In addition, 
the ergodic property of the process is used to predict the steady-state distribution of 
workers among industry code groups. The comparisons between model predictions 
and data are found wanting and so the authors consider more complex models. 
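
The steps of the simple model can be sketched in a few lines. This is an illustration only: the three states and the counts are invented, the function names are hypothetical, and pooling the counts over adjacent quarter pairs before normalizing each row is one plausible reading of "averaging," not necessarily the authors' exact procedure.

```python
def estimate_transition_matrix(count_tables):
    """Pool origin-by-destination counts over adjacent period pairs, then row-normalize."""
    k = len(count_tables[0])
    pooled = [[sum(t[i][j] for t in count_tables) for j in range(k)] for i in range(k)]
    return [[c / sum(row) for c in row] for row in pooled]


def mat_mul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    k = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(k)) for j in range(k)] for i in range(k)]


def mat_power(p, n):
    """n-th iterate of p: the model's prediction for movement over n periods."""
    k = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


def steady_state(p, iterations=500):
    """Approximate the limiting distribution by repeatedly applying p to a uniform start."""
    k = len(p)
    dist = [1.0 / k] * k
    for _ in range(iterations):
        dist = [sum(dist[i] * p[i][j] for i in range(k)) for j in range(k)]
    return dist


# Hypothetical counts for three states over two adjacent quarter pairs.
quarterly_counts = [
    [[80, 15, 5], [10, 85, 5], [20, 20, 60]],
    [[82, 12, 6], [12, 80, 8], [18, 22, 60]],
]
P = estimate_transition_matrix(quarterly_counts)
P4 = mat_power(P, 4)   # to be compared with the observed four-quarter matrix
pi = steady_state(P)   # predicted long-run distribution over the states
print(P4)
print(pi)
```

In the study itself the iterates would be set beside the observed higher order matrices; here no observed matrix is supplied, so only the model side of the comparison is shown.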

Individual differences almost always plague analyses of social processes. In an 
initial attempt to handle this problem, the authors describe a “modified” model in 
Chap. VI. Each worker is called either a “stayer” or a “mover”; a stayer is assumed 
to remain in a single state throughout the time of observation whereas a mover is 
assumed to behave according to the simple process already treated. They denote the 
transition matrix for the movers by M and let S be a diagonal matrix whose elements 
give the proportions of stayers in the several states. The transition matrix for all
workers is then $P = S + (I - S)M$, where $I$ is an identity matrix. The authors then
show that the $n$th order transition matrix is $P^{(n)} = S + (I - S)M^n$, where $M^n$ is the
$n$th power of $M$. ($P^{(n)} \neq P^n$.)
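
A small numerical sketch may make the last remark concrete. The two-state matrix M and the stayer proportions below are hypothetical values chosen for illustration, and the function names are assumptions; nothing here is estimated from the authors' data.

```python
def mat_mul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    k = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(k)) for j in range(k)] for i in range(k)]


def mat_power(m, n):
    """n-th power of a square matrix by repeated multiplication."""
    k = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


def mover_stayer_matrix(stayer_props, m, n=1):
    """n-th order matrix S + (I - S) M^n, with the diagonal of S given as a list."""
    k = len(m)
    mn = mat_power(m, n)
    return [
        [(stayer_props[i] if i == j else 0.0) + (1.0 - stayer_props[i]) * mn[i][j]
         for j in range(k)]
        for i in range(k)
    ]


# Hypothetical two-state example.
stayers = [0.5, 0.2]              # diagonal of S: proportion of stayers in each state
M = [[0.6, 0.4], [0.3, 0.7]]      # transition matrix for the movers

P1 = mover_stayer_matrix(stayers, M, n=1)
P2_model = mover_stayer_matrix(stayers, M, n=2)   # S + (I - S) M^2
P1_squared = mat_power(P1, 2)                     # what the simple chain would predict

print("two-step matrix under the modified model:", P2_model)
print("square of the one-step matrix:           ", P1_squared)
```

In this example the diagonal entries of the modified model's two-step matrix exceed those of the squared one-step matrix: because stayers never move, a simple chain fitted to one-step data understates the proportion remaining in the same state over longer intervals.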

Elements of the matrices S and M cannot be observed directly and so an indirect 
method of estimation is developed by assuming that the process has nearly reached 
a steady state in eight steps. When all the required estimates are obtained, compari-
sons with data are once more made. The results show the modified model is a decided
improvement over the simple model, but systematic deviations still exist. 

In a concluding chapter, some more general models are discussed but they are not 
applied to analysis of the data. The authors repeatedly point out the inadequacies of 
their models and make no pretense of having developed a satisfactory “theory” of 
industrial mobility. But in our opinion, they have made a major contribution to an 
understanding of the processes involved. The discrepancies between the models and 
the data point up features of the phenomenon which could not have been easily de- 
tected by more routine analyses. For example, any social scientist would conjecture 
that there are major individual differences in labor mobility, but Blumen, Kogan, 
and McCarthy find evidence from existing data for these differences, and furthermore 
have estimates of the magnitudes of the effects. 

This investigation offers further support for those who consider the application of 
probability theory a promising approach to social science problems. It is another 
addition to the growing literature on the application of the theory of Markov proc- 
esses to the description of social or psychological phenomena. This study of the 
industry-to-industry movement of workers supplements the attempts to develop 
models in such diverse fields as: attitude change,¹ learning,² psychoanalytic displace-
ment,³ conformity,⁴ sociometric choice,⁵ language and information,⁶ and animal peck-
ing order.⁷ It is an excellent example of how a research team develops and applies a
sequence of models to an important social science problem, and therefore should be
of interest to statisticians as well as sociologists.

¹ T. W. Anderson, “Probability Models for Analyzing Time Changes in Attitudes,” Mathematical Thinking in
the Social Sciences (Paul F. Lazarsfeld, Ed.). Glencoe, Illinois: Free Press, 1954.

² R. R. Bush and F. Mosteller, Stochastic Models for Learning. New York: John Wiley and Sons, 1955.

³ R. R. Bush and J. W. M. Whiting, “On the theory of psychoanalytic displacement,” Journal of Abnormal
and Social Psychology, 48 (1953) 261-72.

⁴ B. P. Cohen, “A Stochastic Model for Conformity,” paper presented at American Sociological Society Annual
Meeting, Detroit, Michigan, September 1956.

⁵ C. P. Leeman, “Patterns of sociometric choice in small groups: A mathematical model and related experi-
mentation,” Sociometry, 15 (1952) 220-43.

⁶ G. A. Miller, “Finite Markov processes in psychology,” Psychometrika, 17 (1952) 149-67.

G. A. Miller and F. C. Frick, “Statistical behavioristics and sequences of responses,” Psychological Review,
56 (1949) 311-24.

E. B. Newman, “The pattern of vowels and consonants in various languages,” American Journal of Psychol-
ogy, 64 (1951) 369-79.

C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, 27 (1948) 379-
423, 623-56.

⁷ A. Rapoport, “Outline of a probabilistic approach to animal sociology,” Bulletin of Mathematical Biophysics,
11 (1949) 183-96, 11 (1949) 273-81, 12 (1950) 7-17.


Statistics of Therapeutic Trials. G. Herdan. Houston, Texas: Elsevier Press, Inc., 1955.
Pp. xvi, 367. $10.50.


Lincoln E. Moses, Stanford University Medical School


This reviewer is unable to commend this book to either statisticians or clinicians.
Some of the reasons appear below.
First, the author’s view of an experimental trial appears to accord no central role
to randomization. At no point did this reviewer find such an injunction as, “Take
2m subjects, and with a table of random numbers, or a deck of cards, divide the sub-
jects into two groups, each of size m.” On p. 24 use of random numbers is mentioned
(for the next to last time) as a legitimate substitute for assigning alternate cases to 
two treatments. One page later the author apparently indicates that randomization 
cannot be used where treatment and control groups are stratified by such variables 
as age or sex or duration of disease. The words of the text are, “The method of ran- 
dom numbers can only replace that of simple alternation. It is obvious that where 
compensating alternation is indicated, it cannot be used, since compensation must 
necessarily interfere with any random selection.” 

As this reviewer understands the book, it is held there that in testing the effective- 
ness of a treatment for a chronic disease (such as hypertension) there is no point in 
using groups of subjects because individual cases are so different from one another 
that matched groups could not feasibly be set up. For example, on p. 277 we find, 
“Although tuberculosis of the lung comes, fundamentally, under the heading of 
chronic diseases, yet the comparative frequency of the disease makes it here possible 
to form groups such as are required for the consideration of therapeutic effects in 
acute diseases. The mere formal possibility of working with groups would, however, 
not be a sufficient reason for us to apply these methods if tuberculosis of the lung 
did not actually present many features characteristic of acute disease.” 

One further facet of the book’s treatment of randomization is revealed in the fol- 
lowing statement, “As long as the clinical observer is himself not biased as to which 
treatment is better than the other, and as long as he does not wish to make the result 
come out one way rather than the other, there is no reason to regard the method of 
equalizing alternation as statistically unsound.” 
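
The passage quoted from p. 25 treats stratification and randomization as incompatible. For illustration only, here is a minimal Python sketch of random assignment carried out separately within each stratum. It is not taken from the book under review; the subject records, the stratum labels, and the function name `randomize_within_strata` are hypothetical.

```python
import random
from collections import defaultdict


def randomize_within_strata(subjects, stratum_of, seed=None):
    """Randomly split each stratum into 'treatment' and 'control' halves.

    If a stratum has an odd number of members, the extra subject goes to control.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)

    assignment = {}
    for members in strata.values():
        shuffled = members[:]          # copy so the caller's list is not reordered
        rng.shuffle(shuffled)          # the random step replaces alternation
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            assignment[subject] = "treatment"
        for subject in shuffled[half:]:
            assignment[subject] = "control"
    return assignment


# Hypothetical subjects: (identifier, sex, age group).
subjects = [
    ("s%02d" % i, "F" if i % 2 else "M", "under50" if i < 8 else "50plus")
    for i in range(16)
]
assignment = randomize_within_strata(
    subjects, stratum_of=lambda s: (s[1], s[2]), seed=1956
)
```

Each sex-by-age stratum contributes equally to both groups, so the balance sought by compensating alternation is kept while the choice of individuals within a stratum is left to chance.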

Second, the book omits much material which one might naturally expect to find 
in a 350-page text on medical statistics. There is no mention of: the analysis of vari- 
ance; such non-parametric tests as Wilcoxon’s, Smirnov’s, Friedman’s, etc.; the 
treatment of contingency tables with ordered categories. 

Third, there is a substantial number of non-trivial errors. A partial list would in- 
clude: 

p. 49. A statement that a very small value of the χ² statistic for goodness of fit
invites rejection of the hypothesis. 

p. 91. Use of binomial confidence limits for a fraction whose structure is not: num- 
ber of successes/number of trials. 

p. 172. In a problem involving fixed times of observation, “There is also a regres- 
sion line, and therefore a regression coefficient, of time upon the characteristic. . . .” 

p. 186. A standard error of forecast is proposed; it leads to a band of constant width. 

p. 254. From a sample of means a (highly dubious) confidence interval for the gen- 
eral mean is offered as a tolerance interval for individual cases. 

p. 299. A contingency table where each subject gives two responses is treated as if 
all observations were independent, rather than by the method of Bowker, Cochran, 
McNemar, etc. 
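
The last item turns on the fact that when each subject gives two responses, only the discordant pairs bear on a change between the two. A minimal Python sketch of an exact test of the McNemar type follows. It illustrates the general idea and is not a reconstruction of the table on p. 299; the counts `b` and `c` and the function name `mcnemar_exact` are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact p-value from the two discordant-pair counts.

    b: subjects responding "yes" first and "no" second
    c: subjects responding "no" first and "yes" second
    Under the null hypothesis each discordant pair is equally likely to fall
    in either cell, so min(b, c) is referred to a Binomial(b + c, 1/2) tail.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical paired table: concordant cells are ignored by the test.
b, c = 10, 2
p_value = mcnemar_exact(b, c)
```

Treating all the observations as independent would instead use every cell of the table, including subjects whose two responses agree.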


Current Research in Human Fertility. Papers Presented at the 1954 Annual Conference 
of the Milbank Memorial Fund. New York: Milbank Memorial Fund, 1955. Pp. 162. 
$1.00 Paper. 


IRENE B. TAEUBER, Office of Population Research, Princeton University


THE altered focus of fertility research and the integration of some modern statistical
developments into conceptualization, design, survey, and analysis are apparent
in the studies reported to the 1954 Annual Conference of the Milbank Memorial
Fund. Detailed evaluation would be premature, for these are reports of preliminary
analyses or of studies planned or in process. Perusing the studies as a group, however,
it is obvious that the approach to fertility analysis in terms of social and psychological
factors presents many and intricate problems in adherence to sample designs,
instruments of measurement, and techniques of measuring interrelations.

The first section, “Studies of Underdeveloped Areas,” begins with a report by C.
Chandrasekaran on the United Nations-Government of India Population Survey 
undertaken in Mysore State. Here there is abundant evidence of the difficulties that 
inhere in samples so chosen as to be feasible for field operations, in probing for intimate 
attitudes and motivations among illiterate or poorly educated people, and in drawing 
conclusions with some degree of generality from small numbers of cases where biases 
are partly known and partly conjectural. Julia Blake’s report on the Conservation 
Foundation’s study of “Family instability and reproductive behavior in Jamaica” 
includes no detail other than that sampling procedures were used in the selection of 
the 99 women and 53 of their mates who were interviewed. It should be noted, though, 
that this appears to be an exploratory survey. The report by Reuben Hill, Kurt Back, and J.
Mayone Stycos on “Family action potentials and fertility planning in Puerto
Rico” is a part of a family and fertility project under way since 1951 under the
auspices of the Social Science Research Center of the University of Puerto Rico. The 
samples are described, the hypotheses stated and classified, the correlations of inde- 
pendent and dependent variables given, and a typology of familism presented and 
tested. 

Three of the studies in the second part, “Studies of Sweden and the United States,” 
are more purely demographic in approach and method. Norman B. Ryder utilizes a 
cohort approach to the influence of declining mortality on marital durations and 
reproductivity in Sweden. Wilson Grabill reports on the progress of a fertility mono- 
graph to be included in the 1950 Census Monograph series. Dudley Kirk utilizes 
diocesan figures from the Official Catholic Directory to assess recent trends of Catholic 
fertility in the United States. P. K. Whelpton’s report on “A study of the ‘expected’ 
completed fertility of a national sample of white women” is a note on the rationale 
and procedures in a study of the “Growth of American Families” being undertaken 
by the Survey Research Center of the University of Michigan and the Scripps Foun- 
dation for Research in Population Problems. Data on expectations, attitudes to fer- 
tility, and control practices were secured from a national sample of women as a basis 
for interpreting recent increases in births, isolating the timing factor in current fer- 
tility, and securing more adequate bases for predicting the future of fertility. 

The third part concerns a new study of social and psychological factors in fertility 
being planned under the auspices of a steering committee of the Milbank Memorial 
Fund. Clyde V. Kiser states the objectives and the broad areas of interest, together 
with the relations of this study to the earlier Indianapolis study. Elliot G. Mischler 
and Charles F. Westoff outline concepts and hypotheses. In view of the preliminary
and tentative character of the proposal in 1954, there was no discussion of study
design or measurement. There are concluding observations on the problems of method
and study design by Philip M. Hauser. 

The Milbank Memorial Fund’s volume of collected papers from its 1954 Confer- 
ence thus has limited value as a depository of methodological or substantive studies. 
It may have permanent value as marking a major transition period in the development 
of more adequate and meaningful research on human fertility. 

Trends and Differentials in Mortality. Papers presented at the 1955 Annual Conference
of the Milbank Memorial Fund. New York: Milbank Memorial Fund, 1956. Pp. 165.
$1.00. Paper.


George F. Mair, Smith College


It is not uncommon for demographers describing their field to point out its inherent
elegance in being based on only three fundamental elements, fertility, mortality,
and migration. That there is room for a substantial variety of interests and approaches
in the study of even one of these three factors is amply demonstrated, however,
by the volume under review. Since limitations of space preclude detailed attention
to all the parts of this highly diverse collection of ten papers and two discussions,
the paragraphs that follow will attempt to indicate briefly what is included, treating
certain papers at greater length than others, and will comment in passing on only a
few of the matters worth discussing.

The book opens with three complementary, though slightly overlapping, papers 
on underdeveloped areas. In the first, Jean Bourgeois-Pichat and Chia-Lin Pan sum- 
marize the limited recent data on trends and determinants of mortality in such areas, 
noting particularly the steep improvements observed in certain cases and suggesting 
that public health programs appear to be the chief explanation for these developments.
George J. Stolnitz then reports that a comprehensive study of the mortality
experience of Western countries shows the recent performance of underdeveloped 
areas to be quite unprecedented and lends corroboration to the view that medical 
advances have been the precipitating element in mortality improvement, while bet- 
ter economic conditions have played an important but essentially permissive role. 
The question whether the marked improvements in mortality will be maintained is 
addressed by Marshall C. Balfour and, in a discussion of Balfour’s paper, by John
E. Gordon. Both authors are optimistic for the near future, though they are in less 
agreement about the long run. Both point out that future developments in mortality 
will depend in part on trends in fertility. 

The second section, devoted to highly developed countries, has quite a different 
character from the first as a result of the wealth of available statistics for the advanced 
areas. Whereas Bourgeois-Pichat and Pan had to devote most of their attention to 
crude death rates and expectations of life, Mortimer Spiegelman’s résumé of recent 
trends and determinants of mortality focusses more on specific causes of death. This 
theme is pursued in detail for one disease by Harold F. Dorn in a paper entitled 
“Ecological factors in morbidity and mortality from cancer,” which discusses the 
incidence of cancer by age, sex, and race, with particular reference to the part of the 
body affected (but, oddly, with little reference to ecology). Both Dorn’s paper and, to 
a lesser extent, the discussion and extension of it by E. Cuyler Hammond, have some- 
what the effect of building up interest in the question of smoking and then leaving the 
reader there, without any citation to the “available evidence . . . too extensive to 
summarize” (p. 89). Harold H. Marks’ paper on mortality among impaired lives is 
at the opposite extreme of documentation, taking nearly the form of an annotated 
bibliography. 

Some new results in the study of occupational and social differentials in mortality 
appear in a paper by Iwao M. Moriyama and Lillian Guralnick based on United 
States data for 1950. When non-agricultural workers are grouped into five occu- 
pational levels, the lowest level, laborers, exhibits far the highest mortality at all 
ages. The age-specific mortality curves of the other four classes cluster together, with 
considerable overlapping, but age-standardized mortality ratios indicate a tendency 
for mortality to rise continuously from the professional to the unskilled group. On
the basis of the crude comparison that is possible between the statistics of the two 
countries, it appears that the occupational differentials in mortality are relatively 
much greater in the United States than in England and Wales. On the other hand, 
the narrowing of the differentials seems to be proceeding more rapidly in the United 
States. Readers will find in this article an implicit demonstration that the mortality 
statistics of highly developed countries are still capable of improvement. 
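
The review does not say how the age-standardized mortality ratios were computed. As a sketch under that caveat, the Python fragment below shows one common form, indirect standardization, in which observed deaths in a group are divided by the deaths expected if standard age-specific rates applied to that group's population. All figures and the function name `standardized_mortality_ratio` are hypothetical.

```python
def standardized_mortality_ratio(observed_deaths, population_by_age, standard_rates):
    """Observed deaths divided by deaths expected at the standard age-specific rates."""
    expected = sum(
        population_by_age[age] * standard_rates[age] for age in population_by_age
    )
    return observed_deaths / expected


# Hypothetical standard death rates per person per year, by age group.
standard_rates = {"25-34": 0.002, "35-44": 0.004, "45-54": 0.010, "55-64": 0.022}

# Hypothetical population of one occupational level, by age group.
laborers = {"25-34": 10000, "35-44": 8000, "45-54": 6000, "55-64": 4000}

# Expected deaths are 20 + 32 + 60 + 88 = 200; 260 observed gives a ratio near 1.30.
smr = standardized_mortality_ratio(260, laborers, standard_rates)
```

A single ratio of this kind summarizes a whole age range, which is why it can show a steady gradient across occupational levels even when the age-specific curves overlap.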

The third part of the volume, “Research in mortality,” begins with Ansley J. 
Coale’s summary of some of the findings of a more complex earlier paper (Milbank 
Memorial Fund Quarterly, January, 1956). Most important is the empirical generalization
that improvements in mortality have typically followed a particular pattern,
namely approximately equal percentage improvements at ages from 5 to 50 and much
larger improvements at the younger and older ages. Since the changes have been such
as to add more young persons than old to the population, especially because greater
survival rates at the younger ages lead to more births, the cause of the “aging” of
populations in the past is to be found not in their mortality decline but in fertility
decline.

Data from 158 life tables thought to be representative of world mortality experience 
in the present century are used by Vasilios G. Valaoras to derive a set of “standard age 
and sex patterns of mortality.” For each age group, the mortality rate (₅qₓ) of a given
life table is paired with the rate (₅qₓ₊₅) for the next older age group shown in the same
life table, and a second degree parabola is fitted to the series of paired observations 
for the age group obtained from the whole set of life tables. A “standard pattern” of 
mortality is computed by starting with an arbitrary mortality rate for age zero and 
using the parabolic equations in turn to determine the mortality rate for each suc- 
ceeding older age. The reviewer has some reservations about the procedure used in 
handling the data so far as sex differentials are concerned, and about the dependence 
of the scheme on the infant mortality rate, but the 40 model life tables published 
will undoubtedly have considerable usefulness. (A fuller presentation of them appears 
in U. N. Population Studies, No. 22.) It would be interesting to examine whether the 
changes from one pattern to another are consonant with the findings of the preceding 
paper. 
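
The chaining procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's computation: the paired observations are synthetic, the same fitted parabola is reused for every step where the actual method fits one per age group, and the names `fit_parabola` and `chain_pattern` are hypothetical.

```python
def fit_parabola(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Build the 3x3 normal equations as an augmented matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def chain_pattern(q_start, fits):
    """Start from a chosen rate and apply each fitted parabola in turn."""
    pattern = [q_start]
    for a, b, c in fits:
        q = pattern[-1]
        pattern.append(a + b * q + c * q * q)
    return pattern


# Synthetic pairs (rate at one age group, rate at the next older group).
xs = [0.01, 0.02, 0.05, 0.10, 0.20]
ys = [0.001 + 0.5 * x + 0.8 * x * x for x in xs]
a, b, c = fit_parabola(xs, ys)

# Hypothetical starting rate; one parabola reused three times for brevity.
pattern = chain_pattern(0.05, [(a, b, c)] * 3)
```

Because every later rate is computed from the one before it, the whole pattern is fixed once the starting rate is chosen, which is the dependence on the infant mortality rate that the reviewer questions.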

The concluding paper, by Rupert P. Vance and Francis C. Madigan, S.J., describes
a research plan designed to increase knowledge about sex differentials in mortality 
by studying the members of certain American Catholic orders of teachers, among 
whom the style of life is nearly the same for men and women. Readers will enjoy 
the opportunity this preview affords them to think about the problems involved be- 
fore the results appear, but they would probably have appreciated a statement of just 
what data will be available for each of the individuals studied. 

It is obvious from the foregoing summary that this volume does not constitute a 
complete or neatly balanced presentation of the subject of mortality on either an 
introductory or an advanced level. It comes much closer to such a goal than might be 
expected of a collection of papers written for a small conference of specialists in dif- 
ferent but associated fields, and is characterized by the competence with which the 
authors have pursued their respective tasks. Like its predecessors in the series of 
publications of Milbank Fund Conferences, it will be a valuable addition to the 
library of anyone interested in the statistics of demography. 


Migration and Mental Disease: A Study of First Admissions to Hospitals for Mental 
Disease, New York, 1939-1941. Benjamin Malzberg and Everett Lee, with an Introduction 
by Dorothy S. Thomas. New York: Social Science Research Council, 1956. Pp. 142. $1.50.
Paper. 


A. W. Marshall, The RAND Corporation


This book begins with a long introduction containing a review of previous work on
the relation of migration and mental disease. The following one hundred pages
contain a competent and straight-forward analysis of mental patient first admission
data for the State of New York, 1939-1941. Age specific rates of first admission are
calculated by sex, color, and diagnosis (manic depressive, dementia praecox, and other
psychoses) and by nativity (foreign or native, and for native born in New York or
elsewhere in the U.S.), or recency of migration to New York (within last five years,
earlier migrants, New York born). The net result of the study is to show that migrants
to the state, and especially very recent migrants, have substantially higher (by factors
of 2 or 3) rates of admission for mental disease than do native born non-migrants.
The cause or causes of these differences are unevaluated. The magnitudes of differences
in the rates by migrant class indicate that use of this variable as an additional
control variable in epidemiological studies of mental disease may be important.

In studies of this type it is not usual to find, nor do we find in this study, much 
use of statistical inference as such. Really statistical aspects of the problems are not 
treated. This is, of course, largely justified since the variances of the estimates of the 
age specific rates are in most cases undoubtedly quite small. The real problems are 
those of bias due to inaccuracy in the data or derive from conditions where the data 
are not appropriate to the questions under investigation. The difference between 
good and bad studies therefore is related to the sensibleness with which the data are 
used and not to the application of more or less advanced statistical techniques. Using 
these criteria the study is a good one. However, with regard to the question of bias 
I have one question. How good are the basic census data on migrant elements in the 
U.S. population, especially the most mobile elements? There is evidence that Negro 
males were substantially under-reported in the 1940 census.¹ How this result is related
to this special group’s migrant activities I do not know, but perhaps there is a
similar under-reporting of migrant groups in general. No imaginable change in the 
population base could, however, sufficiently reduce the migrant (five years or less) 
rates to change any of the study’s substantive conclusions. This worry is, perhaps, 
therefore, only of small practical concern. Some use of the evidence mentioned above 
in calculating rates for male colored populations would nonetheless have been an in- 
teresting addition to the study. 
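
The claim that no plausible change in the population base could reverse the findings can be checked with simple arithmetic. The Python sketch below assumes, for illustration only, that under-reporting affects the migrant denominator alone; the 15 per cent figure and both function names are hypothetical and do not come from the study or the census revision cited.

```python
def undercount_needed(rate_ratio):
    """Fraction of migrants that would have to be missing from the census
    for the corrected migrant rate to equal the non-migrant rate."""
    return 1.0 - 1.0 / rate_ratio


def corrected_ratio(rate_ratio, undercount):
    """Migrant to non-migrant rate ratio after inflating the migrant base.

    If a fraction `undercount` of migrants was missed, the true base is the
    counted base divided by (1 - undercount), so the rate shrinks by that factor.
    """
    return rate_ratio * (1.0 - undercount)


needed_for_2 = undercount_needed(2.0)   # half of all migrants uncounted
needed_for_3 = undercount_needed(3.0)   # two thirds of all migrants uncounted
after_15_percent = corrected_ratio(2.0, 0.15)
```

Under these assumptions an observed ratio of 2 would vanish only if half the migrants had been missed, while a hypothetical 15 per cent under-count would still leave a ratio of about 1.7.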


The Derivation of Rates of Separations from Mental Hospitals. J. W. Fisher and E. E. 
Clarke. Ottawa: Mental Health Division, Department of National Health and Welfare, 
Report Series, Memorandum No. 1, 1955. Pp. ii, 51. Planographed. Free. 


Harold F. Dorn, National Institutes of Health


This pamphlet describes methods for computing rates of separation from mental
hospitals by standard actuarial techniques. Three rates are defined: (a) the probability
of decrement, (b) the absolute rate of decrement, and (c) the central rate of
decrement. The computation of these rates is illustrated for cohort and census type
populations with single and multiple causes of decrement.
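
The three rates named above can be illustrated numerically. The Python sketch below is not drawn from the pamphlet, whose formulas the review does not reproduce. It uses common textbook approximations that assume separations are spread evenly over the interval, and the cohort size, the two causes, and the function name `decrement_rates` are hypothetical.

```python
def decrement_rates(cohort_size, decrements):
    """Approximate decrement rates for a cohort followed over one interval.

    cohort_size: number under observation at the start of the interval
    decrements:  mapping of cause -> number separated by that cause
    """
    total = sum(decrements.values())
    rates = {}
    for cause, d in decrements.items():
        others = total - d
        rates[cause] = {
            # (a) probability of decrement: share of the starting cohort
            "probability": d / cohort_size,
            # (b) absolute rate: the rate if this cause acted alone, with
            #     half of the other separations removed from exposure
            "absolute": d / (cohort_size - 0.5 * others),
            # (c) central rate: separations per person at mid-interval
            "central": d / (cohort_size - 0.5 * total),
        }
    return rates


# Hypothetical cohort of 1,000 first admissions with two causes of separation.
rates = decrement_rates(1000, {"discharge": 300, "death": 100})
```

In this example the probability of discharge is 0.300, the absolute rate about 0.316, and the central rate 0.375; the absolute rate for one cause is free of the other cause only through the arithmetic adjustment of the denominator.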

The pamphlet apparently was prepared for persons with little or no familiarity 
with actuarial or statistical techniques. The meaning and method of computation of 
decrement rates for cohort type populations are very simply and clearly described 
and illustrated by numerical examples. In contrast, the derivation and description 
of separation rates for census-type population are stated very concisely, and possibly 
may not be so easily understood by the type of person to whom the discussion of 
cohort-type populations is addressed. This is not a serious defect since a knowledge 
of the method of computation of decrement rates for a cohort-type population is 
sufficient for most purposes. This method of computation is to be preferred whenever 
a choice is possible. 





1 Ansley J. Coale, “The population of the United States in 1950 classified by age, sex, and color—A revision of 
census figures,” Journal of the American Statistical Association, 50 (1955), 16-54, see especially pp. 35 and 40. 







The authors emphasize the necessity for precisely defining the population to be
studied, the classification of cases to be included, the interval of observation, and the
kind of rates to be computed before data are assembled for a study of separations
from a mental hospital. Preference is expressed for the absolute decrement rate on
the grounds that the rate for a specified cause of decrement is independent of the
rates for other causes of decrement. This is true only in a strictly arithmetical sense
unless the causes of decrement in fact act independently, which is rarely so for most
causes.

The authors have capably accomplished their objective of preparing a simple manual
describing the computation of separation rates for patients in mental hospitals.





PUBLICATIONS RECEIVED 


Abramovitz, Moses. Resource and Output Trends in the United States since 1870.
Occasional Paper 52. New York: National Bureau of Economic Research, 1956. $0.50.
Paper.

American Council on Education, Committee on the Evaluation of the Tyler
Fact-Finding Study. Conclusions and Recommendations on a Study of the General
Educational Development Testing Program. Washington, D. C.: American Council on
Education, 1956. $1.09. Paper.

Armsen, P. A New Form of Table for Significance Tests in a 2×2 Contingency
Table. Cambridge, England: University Press, 1955. 2s.6d. Paper.

Atkinson, Thomas R. The Pattern of Financial Asset Ownership, Wisconsin
Individuals, 1949. Princeton: Princeton University Press, 1956. $3.75.

Bachi, Roberto. Scripta Hierosolymitana, Vol. III. Jerusalem: Hebrew University at
the Magnes Press, 1956. Paper.

Blair, G. W. Scott. Measurements of Mind and Matter. New York: Philosophical
Library, Inc., 1956. $4.50.

Central Statistical Office, London. National Income Statistics: Sources and Methods.
London: Her Majesty’s Stationery Office, 1956. 25s.

Edwards, Corwin D. Big Business and the Policy of Competition. Cleveland: Press
of Western Reserve University, 1956. $3.50.

Edwards, Frederick, Editor. Readings in Market Research. London: The British
Market Research Bureau Limited, 1956.

Fegiz, Pierpaolo Luzzatto. Il Volto Sconosciuto Dell’Italia. Milan: Dott. A.
Giuffre, 1956. Lire 6000.

Hutchinson, E. P. Immigrants and Their Children, 1850-1950. New York: John
Wiley and Sons, Inc., 1956. $6.50.

Institut de Science Economique Appliquee. Niveaux de Developpement et Politiques
de Croissance. Paris: Institut de Science Economique Appliquee, 1956. Paper.

Institute of Life Insurance. 1956 Life Insurance Fact Book. New York: Institute of
Life Insurance, 1956. Free. Paper.

Katona, George, and Mueller, Eva. Consumer Expectations, 1953-1956. Survey
Research Series Publication No. 16. Ann Arbor: Institute for Social Research,
University of Michigan.

smore, &: Energy Sources in Canada Commodity Accounts for 1948 and 1952.
Reference Paper No. 69. Ottawa: Dominion Bureau of Statistics, Industry and
Merchandising Division, 1956. $1.00. Paper.

Mainland, Donald, Herrera, Lee, and Sutcliffe, Marion I. Tables for Use with
Binomial Samples—Contingency Tests, Confidence Limits and Sample Size Estimates.
New York: Dept. of Medical Statistics, New York University College of Medicine,
1956. $2.00. Stiff cover; spiral binding; photo-offset.

Marshall, Douglas G. Wisconsin’s Popu- 


lation—-Cha: and Age ve Madison: 
University of Wisconsin Agricultural Experiment Station, 1956. Free. ag


Mauser, Ferdinand F., and Schwartz, David J., Jr. Introduction to American
Business. New York: American Book Company.

McCloskey, Joseph F., and Coppinger, John M. Operations Research for Management.
Volume II. Baltimore: The Johns Hopkins Press, 1956. $8.00.

Melman, Seymour. Dynamic Factors in Industrial Productivity. New York: John
Wiley and Sons, Inc., 1956. $4.75.

Moroney, M. J. Facts from Figures. Third Edition. Baltimore: Penguin Books, Inc.,
1956. $0.95. Paper.

Paden, Donald W., and Lindquist, E. F. Statistics for Economics and Business,
Second Edition. New York: McGraw-Hill Book Co., Inc., 1956. $4.75.

Porterfield, James T. S. Life Insurance Stocks as Investments. Business Research
Series No. 9. Stanford, California: Stanford University School of Business, 1956.
$1.50. Paper.

Rannells, John. The Core of the City: A Pilot Study of Changing Land Uses in
Central Business Districts (Philadelphia). New York: Columbia University Press,
1956. $5.50.

Robertson, Sir Dennis H. Economic Commentaries. New York: John De Graff,
Inc., 1956. $3.75.

Rogers, Martha E., Lilienfeld, Abraham M., and Pasamanick, Benjamin. Prenatal
and Paranatal Factors in the Development of Childhood Behavior Disorders.
Copenhagen: Ejnar Munksgaard Ltd., 1955. 5.

Russell Sage Foundation, Committee on Statistical Program for the City of New
York. A Statistical Program for the Department of Health of the City of New York,
1956. $1.00. Paper.

Samperio, Jose V. Montesino. La Poblacion del Area Metropolitana de Caracas.
Caracas, Venezuela: Cuadernos de Informacion Economica, 1956. Paper.

Smith, C. Frank, and Davies, George R. Calculus for Business. Dubuque, Iowa:
Wm. C. Brown Company, 1956. $2.95. Paper.

Snedecor, George (with Chapter 17 on Sampling by Cochran, Wm. G.). Statistical
Methods, Fifth Edition. Ames, Iowa: Iowa State College Press, 1956. $7.50.

Spurr, William A. Workbook in Business and Economic Statistics. Homewood,
Illinois: Richard D. Irwin, Inc., 1956. $3.50.

Stolnitz, George J. Life Tables from Limited Data: A Demographic Approach.
Princeton: Office of Population Research, 1956. $4.00.

Thone, Georges, Editor. Colloque sur la Theorie des Nombres. Liege: Georges
Thone, 300 fr.; Paris: Masson and Co., 2,400 fr. Paper.

Titchen, Robert S., Rosenthal, Arnold J., Bollerman, Bruce, and Nistico, Frank,
Editors. Quality Control and Applied Statistics, Vol. I, No. 1. New York:
Interscience Publishers, Inc., 1956. $60.00 per year; binder, $5.00. Looseleaf.

United States Congress, 84th Congress, 2nd Session, Subcommittee on Foreign
Economic Policy of the Joint Economic Committee. Defense Essentiality and Foreign
Economic Policy (Case Study: The Watch Industry and Precision Skills).
Washington: U. S. Govt. Printing Office, 1956. Free.

Vance, Lawrence L., and Neter, John. Statistical Sampling for Auditors and
Accountants. New York: John Wiley and Sons, Inc., 1956. $9.00.

Waldo, Dwight. Political Science in the United States of America: A Trend Report.
Paris: UNESCO, 1956. $1.00. Paper.

West, Quentin M., and Conklin, Howard E. Results of Applying a Simple Random
Sampling Process to Farm Management Data. Memoir 345. Ithaca, N. Y.:
New York State College of Agriculture, 1956. Free.

World Health Organization, Division of Editorial and Reference Services. Annual
Epidemiological and Vital Statistics, 1953. Geneva, Switzerland: Columbia University
Press, 1956. $10.00. Paper.













LIST OF REVIEWERS 


Anderson, R. L.
Anscombe, F. J.
Arias B., Jorge
Arkin, Herbert
Baker, G. A.
Balderston, F. E.
Bernstein, Peter L.
Birnbaum, Allan
Blank, David M.
Bowman, Ward S., Jr.
Beat; BENS eS
Braga, Fred W.
Brinegar, Claude S.
Brower, E. J.
Brown, George H.
Budd, Edward C.
Buechley, Robert W.
Bush, Robert R. 414, 661,
Carvalho, Alceu Vicente De
Cochran, William G.
Cohen, Bernard P.
Cohen, Jerome B.
Coleman, James S.
Connor, W. S.
Derrick, Lucile
Donnelly, T. G.
Dorn, Harold F.
Enke, Stephen
Fry, Thornton C.
Garvy, George
Glover, K. F.
tee
Gordon, Margaret S.
Gordon, R. A.
Greenberg, B. G.
Gretton, Owen C.
Guttman, Louis
Handler, A. B.
Harberger, Arnold C.
Harvey, Robert O.
Hassler, J. B.
Hauser, Philip M.
Heinze, Shirley J.
Hendricks, Walter A.
Hoselitz, Bert F.
Houseman, Earl E.
Hultgren, Thor
Jacoby, Neil H.
Jaffe, A. J.
Jarrett, R. F.
Johnson, D. Gale
Johnson, Howard W.


539, 666, 


. 677, 


667 
657 
662 
663 
387 
180 


385 


Johnson, M. Clemens
Jones, Frank E.
Jones, Lyle V.
Kahn, Robert L.
Karlin, Samuel
Katz, Leo
Kendrick, John W.
Kershaw, Joseph A.
Kester, Henry I.
Kimball, Bradford F.
Kohn, Robert
Lacey, Oliver L.
Lang, Kurt
Lee, Ivan M.
Lee, Maurice W.
LeNeveu, A. H.
Long, Clarence D.
Mack, Ruth P.
Mair, George F.
Marshall, A. W.
Miller, M. H.
Mills, Edwin S.
Mode, Elmer B.
Moses, Lincoln E.
Noether, Gottfried E.
Parker, William N.
Potter, Robert G., Jr.
Rashley, F. J.
Rice, Emmett J.
Rios, Sixto
Rogoff, Natalie
Sarhan, A. E.
Savage, Leonard J.
Saxenian, Hrand
Schiff, Eric
Schweiger, Irving
Seiden, Esther
Silk, Leonard P
Snyder, Robert E.
Solomon, David
Solow, Robert M.
Steiner, Peter O.
Taeuber, Irene B.
Teichroew, D.
Terry, Milton E.
Trueblood, Robert M.
Walker, Helen M.
Wallace, David A.
Wilk, M. B.
Withey, S. B.
Wright, Kenneth M.
Wrong, Dennis H.


721 


a a 
. 165, 380, 416 
. 178, 546 
. 407 
183 

176 

203 

377 

197 

405 

393 

410 

192 
<a 
. 199, 706 
. 708 
<i eee 
. 376, 660 
377 

704 

660 

693 

409 

404 

682 

204 

188 

671 

383 

535 

554 

. . 550 
- 205, 205 
. 544 
695 

416 

















INTRODUCTION TO STATISTICAL ANALYSIS, Second Edition 


By WILFRED J. DIXON and FRANK J. MASSEY, Jr., University of 
California, Los Angeles. In press 


An excellent revision of one of the most popular of general and mathematical
statistics texts. With no calculus prerequisite, it has been adopted in a variety
of situations ranging from math departments and business administration,
to biology and agriculture. It presents the basic concepts of statistics in a
manner which will show the student the generality of the application of the
statistical method. Both classical and modern techniques are presented with 
emphasis on the understanding and use of the techniques, rather than on 
mathematical development. Examples are drawn from such fields as agricul- 
ture, business, chemistry, engineering, medical research, and psychology. 


This new edition has brought much material up to date and has expanded the 
latter chapters to include several important topics. A chapter on probability 
has been added; the chapter on analysis of variance has been completely 
rewritten; the collection of tables has been expanded. The section on “Non- 
parametric Statistics” is the most complete of any general text, and the sec- 
tions on “Power of Tests” are unique at the elementary level. 


NONPARAMETRIC STATISTICS. For the Behavioral Sciences 
By SIDNEY SIEGEL, Pennsylvania State University. 330 pages, $6.50 


The first book-length treatment on nonparametric, or distribution-free, sta- 
tistics. It gives comprehensive coverage to the nonparametric statistical tests and 
measures of correlation, demonstrating their usefulness in research in the be- 
havioral sciences. It is written for the reader with no special training in mathe- 
matics, and is organized to serve as a reference work as well as a text. 


ELEMENTARY BUSINESS AND ECONOMIC STATISTICS 
By ALVA M. TUTTLE, Ohio State University, In press—ready in March 


An elementary, complete, and clearly written text on business statistics. Es- 
sential material is covered in an understandable way, with even the simplest 
steps in arithmetic included. Techniques are emphasized, with statistical charts 
and tables exceptionally thorough. The unusual simplicity of the book is due 
to its very careful explanation and illustration. 








Send for copies on approval 











McGRAW-HILL BOOK COMPANY, INC. 


330 WEST 42nd STREET, NEW YORK 36, N. Y.





Please mention the Journal of the American Statistical Association in writing advertisers





METHODS OF STATISTICAL ANALYSIS 
IN ECONOMICS AND BUSINESS 


E. E. Lewis 


Dr. Lewis has organized this effective introductory textbook into 
a series of relatively short sections, each devoted to a single topic 
and each followed by a set of illustrative exercises. This sectional
organization, giving a very complete coverage of the basic tech- 
niques of statistical analysis, allows the instructor to select exactly 
what he wishes to take up as well as to introduce any other desired 
illustrative material. Along with a careful description of the
common statistical calculations, the exact nature and purpose of the
various statistical measures are given full consideration. The 
intricate subject of statistical inference is treated in a detailed and 
logical way. 


DESIGN AND ANALYSIS OF EXPERIMENTS 
IN PSYCHOLOGY AND EDUCATION 


E. F. Lindquist 


With this text the student and research worker in psychology or 
education will develop a thorough understanding of the basic 
principles of experimental design and analysis. He will be able 
to select or devise for himself the appropriate designs for specific 
situations, to modify or combine standard designs, and to analyze 
and interpret the obtained results. 


Consideration is given to almost every experimental design of any 
importance in psychological and educational research. Particular 
attention is given to mixed designs, trend analysis, and estimation 
of variance components in reliability studies. An Instructor's 
Manual is available. 


HOUGHTON MIFFLIN COMPANY 













Inference oriented 


INTRODUCTION TO 

MODERN STATISTICS 

With Applications to Business and Economics 

by WERNER Z. HIRSCH, Associate Professor of 
Economics, Washington University 


Presupposing no mathematical knowledge beyond 
algebra, this basic text for students of business and economics
introduces the student to modern inference methods
and gives him a sound understanding of the tools and
ideas of statistics, particularly as they apply to decision- 
making. 


Comprehensive and stimulating, this text... 


* introduces each problem by a concrete example drawn
from actual experience and shows what methods are
applicable and how they help solve the problem

* presents modern statistical concepts and their uses in
an intelligently satisfying manner and, at the same
time, enlivens the subject with actual business examples
and pertinent, often humorous, anecdotes and
illustrations

* bases the material upon the probability approach and
includes such modern topics as managerial and
quality control, electronic computers, and sampling
and bias

* offers numerous interesting classroom-tested exercises


Ready Spring 1957 


The Macmillan Company
60 FIFTH AVENUE, NEW YORK 11, N.Y.

Distributors in Canada: Brett-Macmillan Ltd.



















mathematicians 


For Analysis Group of expanding research and de- 
velopment laboratory. Principal fields of interest 
are: weapons systems analysis, peacetime applica- 
tions of atomic energy, and operations research. 
Several openings are available. 


1. To carry out studies in OPERATIONS RESEARCH.
Familiarity with probability theory,
linear programming, game theory, information
theory, optimization procedures, and other OR
techniques very desirable.

2. To perform operational analyses requiring extensive
background in aerodynamics and exterior
ballistics. May also investigate problems
dealing with missile fire control and navigational
systems. Familiarity with digital computer programming
desirable.

3. To conduct investigations in the fields of electromagnetic
theory, acoustics, thermal and
radiation effects.


These openings require men with vision and initia- 
tive. Our modern laboratory provides a profes- 
sional working atmosphere and the location in a 
quiet suburban area provides pleasant living and 
working with easy access to the cultural and educa- 
tional facilities of New York City. 


All inquiries in confidence. Please send resume, in- 
cluding salary desired, to Personnel Manager. 


VITRO LABORATORIES


Division of Vitro Corp. of America 


200 Pleasant Valley Way 
West Orange, New Jersey 



















OPPORTUNITIES IN
OPERATIONS RESEARCH
for STATISTICIANS and MATHEMATICIANS
vacancies at THE JOHNS HOPKINS UNIVERSITY


7100 Connecticut Avenue, Chevy Chase 15, 














“IMPROVING THE QUALITY OF STATISTICAL SURVEYS”
is a collection of eight papers contributed as a memorial to Samuel Weiss. Each paper
treats of some aspect of Federal statistics and the steps being undertaken to improve
the quality of the results. Most of these undertakings have not been described elsewhere. 
The papers included in this volume are: 








Foreword—William G. Cochran 
Tax Returns as a Source of Benchmark Statistics—Helen F. Demond 


Use of a Sample Survey for Estimating an Aggregate Quarterly Financial Statement
for a Population of Corporations—Dorothy M. Gilford and Charles L. Marks


Non-Sampling Errors in Agricultural Surveys—Walter A. Hendricks 


New Measures of Economic Fluctuations, Preliminary Comments and Illustrations—
Julius Shiskin 


A Proposed Study for Extending the Scope and Improving the Quality of Mortality 
Data—W. Haenszel, I. M. Moriyama and M. G. Sirken


Controlling Quality in Railroad Traffic Statistics—R. Tynes Smith III 


Some Problems in the Statistical Measurement of Chronic Disease—T. D. Woolsey 
and H. Nisselson 


Some Notes on a Study of Response—Dudley E. Young 





Price: $1.00 per copy; 95 pp., paper cover 
Send your order to: 
American Statistical Association 


1757 K Street, N.W. 
Washington 6, D.C. 













CHAPTER PRESIDENTS

ALBANY: Basil Y. Scott, 2 Summit St., Rensselaer, New York
AUSTIN: John H. Hargrove, 2005 Raleigh, Austin, Texas
BOSTON: Eugene W. Pike, 10 Churchill Lane, Lexington, Massachusetts
BUFFALO-NIAGARA: A. M. Lilienfeld, 80 Delham Ave., Buffalo 16, New York
CENTRAL INDIANA: Virgil L. Anderson, Statistical Laboratory, Purdue University, West Lafayette, Indiana
CENTRAL NEW JERSEY: Martin B. Wilk, Department of Mathematics, Fine Hall, Princeton, New Jersey
CHICAGO: Elizabeth J. Slotkin, 5512 Woodlawn Avenue, Chicago 37, Illinois
CLEVELAND: Russell I. Haley, American Greetings Corp., 1300 West 78th St., Cleveland 2, Ohio
COLUMBUS: William M. Duffus, c/o Dr. M. V. Condoide, Hagerty Hall, The Ohio State University, Columbus 10, Ohio
CONNECTICUT: James Tobin, Yale University, New Haven, Connecticut
DAYTON: Max Astrachan, 1513 Cory Dr., Dayton, Ohio
DENVER: George E. Bardwell, University of Denver, 1446 Cleveland Place, Denver 2, Colorado
DETROIT: Wallace W. Gardner, School of Business Administration, University of Michigan, Ann Arbor, Michigan
HAWAII: Frederick S. W. Loo, 2141 Aupuni Street, Honolulu 17, Hawaii
ILLINOIS: Walter C. Jacob, Dept. of Agronomy, University of Illinois, Urbana, Illinois
ITHACA: C. R. Henderson, Department of Husbandry, Cornell University, Ithaca, New York
MILWAUKEE: William A. Golomski, Instructor in Mathematics, Marquette University, Milwaukee, Wisconsin
MONTREAL: Charles S. Carter, Bell Telephone Company of Canada, 1050 Beaver Hall Hill, Montreal, Quebec, Canada
NEW ORLEANS: Roland Pertuit, 4871 Metropolitan Drive, New Orleans, Louisiana
NEW YORK: Robert E. Johnson, Western Electric Co., 195 Broadway, New York 7, N.Y.
NORTH CAROLINA: Gertrude M. Cox, Institute of Statistics, Box 5457, State College Station, Raleigh, North Carolina
NORTH TEXAS: Albert W. Wortham, 3919 Pyka Drive, Dallas, Texas
OKLAHOMA CITY: Richard W. Poole, Oklahoma City Chamber of Commerce, Skirvin Towers Hotel, Oklahoma City, Oklahoma
PHILADELPHIA: Dorothy S. Thomas, 118 South Van Pelt Street, Philadelphia 3, Pennsylvania
PITTSBURGH: Donovan J. Thompson, Graduate School of Public Health, University of Pittsburgh, Pittsburgh 17, Pennsylvania
PUERTO RICO: Luz M. Torruellas, Puerto Rican Economic Association, P.O. Box 2003, University Station, Rio Piedras, Puerto Rico
ROCHESTER, N.Y.: S. Lee Crump, Atomic Energy Project, P.O. Box 287, Station 8, Rochester 20, New York
SACRAMENTO: Wilbur L. Parker, 360 Sandburg Dr., Sacramento 19, California
SAN FRANCISCO: Helen Nelson, Div. of Labor Statistics & Research, Calif. Dept. of Industrial Relations, P.O. Box 965, San Francisco, California
SOUTHERN CALIFORNIA: John A. Scott, 1417 Oak St., Santa Monica, California
STATE COLLEGE, PA.: James B. Bartoo, Pennsylvania State College, State College, Pennsylvania
ST. LOUIS: Arthur C. Meyers, Jr., 3674 Lindell, St. Louis 8, Missouri
TULSA: Robert Spears, Oklahoma A & M College, Stillwater, Oklahoma
VIRGINIA: John E. Freund, Virginia Polytechnic Institute, Dept. of Statistics, Blacksburg, Virginia
WASHINGTON, D.C.: Homer Jones, 3067 Ordway Street, N.W., Washington 8, D.C.








ALBANY
AUSTIN
BOSTON
BUFFALO-NIAGARA
CENTRAL INDIANA
CENTRAL NEW JERSEY
CHICAGO
CLEVELAND
COLUMBUS
CONNECTICUT
DAYTON
DENVER
DETROIT
HAWAII
ILLINOIS
ITHACA
MILWAUKEE
MONTREAL
NEW ORLEANS
NEW YORK
NORTH CAROLINA
NORTH TEXAS
OKLAHOMA CITY
PHILADELPHIA
PITTSBURGH
PUERTO RICO
ROCHESTER, N.Y.
SACRAMENTO
SAN FRANCISCO
SOUTHERN CALIFORNIA
STATE COLLEGE, PA.
ST. LOUIS
TULSA
VIRGINIA
WASHINGTON, D.C.




























Mary T. Petty, Federal Reserve Bank of Chicago, Chicago, Illinois 
Arthur S. Littell, School of Medicine, Western Reserve University,
Cleveland 8, Ohio
Mikhail V. Condoide, 188 West 10th Avenue, Columbus 1, Ohio 
Royal A. Crystal, Connecticut Medical Service, Inc., P.O. Box -


Connecticut
thn Sake High: St., Yellow Springs, Ohio 
Survey Statistician, Air Force Finance Center,
3800 York Street, Denver, Colorado


Gordon Frazier, 3877 Lurline Drive, Honolulu, Hawaii

Frederick Williams, 109 West Pennsylvania, Urbana, Illinois

Philip J. McCarthy, New York State School of Industrial and Labor
Relations, Cornell University, Ithaca, New York

Joseph V. Talacko, Dept. of Mathematics, Marquette University, 
Milwaukee 3, Wisconsin

Kenneth E. Vroom, Pulp and Paper Research Institute, 3420
University St., Montreal 2, Canada 

Elsie M. Watters, School of Business Administration, Tulane
University, New Orleans, Louisiana 

John M. Firestone, 6454 Sylvan Ave., New York 71, N.Y.

oa J. Monroe, State College Station, Box 6467, Raleigh, North
Carolina

Stewart F. Mitchell, Allstate Insurance Co., 212 N. St. Paul,
Dallas, Texas

Elsie Lee Brown, 488 N.W. 25th, Oklahoma City 8, Oklahoma

John H. Norton, Statistics Dept., Dietrich Hall, Univ. of Pennsylvania,
Philadelphia 4, Pa.

Herbert Ginsburg, Materials Engineering Dept., Westinghouse
Electric Corp., E. Pittsburgh, Pennsylvania

Eric Cumpiano, Economic Development Administration, Santurce, 
Puerto Rico 

Jack Karger, 210 East Hickory St., East Rochester, New York

Carl M. Frisen, 1570 Castec Dr., Sacramento 21, California

Miss Phillis Beattie, U. S. Bureau of Labor Statistics, 680 Sansome 
Street, Room 80%, San Francisco 11, California 

Charles I. Landenherger, 965 Coronado Dr., Glendale 6, California 

pero is Brandow, 312 East Mitchell Ave., State College, Pennsylvania


Gate) Little, c/o Southwestern Bell Telephone Co., 1010 Pine St., 
St. Louis 1, Missouri
Milton F. Searl, Stanolind Oil and Gas Company, P.O. Box 691,
Tulsa, Oklahoma 
Clyde Y. Kramer, Dept. of Statistics, Virginia Polytechnic Inst.,
Blacksburg, Virginia
Harold Wool, Office of Assistant Secretary of Defense (MP&R),
O.S.D., Pentagon, Washington 25, D.C.

















APPLIED GENERAL STATISTICS, 2nd Ed. 


FREDERICK E. CROXTON, Columbia University 
DUDLEY J. COWDEN, University of North Carolina 


Revised, reorganized, and largely rewritten
for clarity and brevity, this new
edition retains the well-known thoroughness
and completeness of the original in
substantially less bulk. It is also more
logically organized for ready use of 19
chapters in a short course if desired.


Other improvements: Clarification of
some symbols and chapter vocabularies
of symbols; simplified explanations; expanded
topics; infrequently used materials
dropped; fresh illustrations incorporated
and about 150 new examples
added; confusion of terms eliminated;
revision of some proofs of formulas in the
appendix demonstrations.


Expanded topics include sampling, correlation
of time series, analysis of variance,
non-linear correlation, confidence limits
(of arithmetic means, proportions, variance
and correlation coefficients) and significance
of differences.


843 pages • 6" x 9" • Published 1955


WORKBOOK IN APPLIED GENERAL STATISTICS, 4th Ed. 


FREDERICK E. CROXTON, Columbia University


30 carefully worked out exercises covering
the elementary portions of Croxton
& Cowden's new Second Edition above,
with spaces for answers and appropriate
work sheets and tabular forms. Based on
real data, provides ample lab materials
for a year’s work, adaptable to shorter use
by omitting steps or supplying answers in
part.

Answers and Instructions (76 pages)
available on adoption (restricted). 

127 pages • 8½" x 11" • Pub. Aug. 1956







Prentice-Hall


GEORGE BANTA COMPANY, INC., MENASHA, WISCONSIN, U.S.A.





